-
Know Your Neighborhood: General and Zero-Shot Capable Binary Function Search Powered by Call Graphlets
Authors:
Joshua Collyer,
Tim Watson,
Iain Phillips
Abstract:
Binary code similarity detection is an important problem with applications in areas like malware analysis, vulnerability research and plagiarism detection. This paper proposes a novel graph neural network architecture combined with a novel graph data representation called call graphlets. A call graphlet encodes the neighborhood around each function in a binary executable, capturing the local and global context through a series of statistical features. A specialized graph neural network model is then designed to operate on this graph representation, learning to map it to a feature vector that encodes semantic code similarities using deep metric learning. The proposed approach is evaluated across four distinct datasets covering different architectures, compiler toolchains, and optimization levels. Experimental results demonstrate that the combination of call graphlets and the novel graph neural network architecture achieves state-of-the-art performance compared to baseline techniques across cross-architecture, mono-architecture and zero shot tasks. In addition, our proposed approach also performs well when evaluated against an out-of-domain function inlining task. Overall, the work provides a general and effective graph neural network-based solution for conducting binary code similarity detection.
Submitted 2 June, 2024;
originally announced June 2024.
-
Canaries and Whistles: Resilient Drone Communication Networks with (or without) Deep Reinforcement Learning
Authors:
Chris Hicks,
Vasilios Mavroudis,
Myles Foley,
Thomas Davies,
Kate Highnam,
Tim Watson
Abstract:
Communication networks able to withstand hostile environments are critically important for disaster relief operations. In this paper, we consider a challenging scenario where drones have been compromised in the supply chain, during their manufacture, and harbour malicious software capable of wide-ranging and infectious disruption. We investigate multi-agent deep reinforcement learning as a tool for learning defensive strategies that maximise communications bandwidth despite continual adversarial interference. Using a public challenge for learning network resilience strategies, we propose a state-of-the-art expert technique and study its superiority over deep reinforcement learning agents. Correspondingly, we identify three specific methods for improving the performance of our learning-based agents: (1) ensuring each observation contains the necessary information, (2) using expert agents to provide a curriculum for learning, and (3) paying close attention to reward. We apply our methods and present a new mixed strategy enabling expert and learning-based agents to work together and improve on all prior results.
Submitted 8 December, 2023;
originally announced December 2023.
-
FASER: Binary Code Similarity Search through the use of Intermediate Representations
Authors:
Josh Collyer,
Tim Watson,
Iain Phillips
Abstract:
Being able to identify functions of interest in cross-architecture software is useful whether you are analysing for malware, securing the software supply chain or conducting vulnerability research. Cross-architecture binary code similarity search has been explored in numerous studies and has used a wide range of different data sources to achieve its goals. The data sources typically used draw on common structures derived from binaries such as function control flow graphs or binary-level call graphs, the output of the disassembly process or the outputs of a dynamic analysis approach. One data source which has received less attention is binary intermediate representations. Binary intermediate representations possess two interesting properties: they are cross-architecture by their very nature and encode the semantics of a function explicitly to support downstream usage. Within this paper we propose Function as a String Encoded Representation (FASER), which combines long document transformers with the use of intermediate representations to create a model capable of cross-architecture function search without the need for manual feature engineering, pre-training or a dynamic analysis step. We compare our approach against a series of baseline approaches on two tasks: a general function search task and a targeted vulnerability search task. Our approach demonstrates strong performance across both tasks, performing better than all baseline approaches.
Submitted 29 November, 2023; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Observation of high-energy neutrinos from the Galactic plane
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
S. W. Barwick,
V. Basu,
S. Baur,
R. Bay,
J. J. Beatty,
K. -H. Becker,
J. Becker Tjus
, et al. (364 additional authors not shown)
Abstract:
The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrino emission using machine learning techniques applied to ten years of data from the IceCube Neutrino Observatory. We identify neutrino emission from the Galactic plane at the 4.5$σ$ level of significance, by comparing diffuse emission models to a background-only hypothesis. The signal is consistent with modeled diffuse emission from the Galactic plane, but could also arise from a population of unresolved point sources.
Submitted 10 July, 2023;
originally announced July 2023.
-
LYSTO: The Lymphocyte Assessment Hackathon and Benchmark Dataset
Authors:
Yiping Jiao,
Jeroen van der Laak,
Shadi Albarqouni,
Zhang Li,
Tao Tan,
Abhir Bhalerao,
Jiabo Ma,
Jiamei Sun,
Johnathan Pocock,
Josien P. W. Pluim,
Navid Alemi Koohbanani,
Raja Muhammad Saad Bashir,
Shan E Ahmed Raza,
Sibo Liu,
Simon Graham,
Suzanne Wetstein,
Syed Ali Khurram,
Thomas Watson,
Nasir Rajpoot,
Mitko Veta,
Francesco Ciompi
Abstract:
We introduce LYSTO, the Lymphocyte Assessment Hackathon, which was held in conjunction with the MICCAI 2019 Conference in Shenzhen (China). The competition required participants to automatically assess the number of lymphocytes, in particular T-cells, in histopathological images of colon, breast, and prostate cancer stained with CD3 and CD8 immunohistochemistry. Unlike other challenges set up in medical image analysis, LYSTO participants were given only a few hours to address this problem. In this paper, we describe the goal and the multi-phase organization of the hackathon; we describe the proposed methods and the on-site results. Additionally, we present post-competition results where we show how the presented methods perform on an independent set of lung cancer slides, which was not part of the initial competition, as well as a comparison on lymphocyte assessment between presented methods and a panel of pathologists. We show that some of the participants were able to achieve pathologist-level performance at lymphocyte assessment. After the hackathon, LYSTO was left as a lightweight plug-and-play benchmark dataset on the grand-challenge website, together with an automatic evaluation platform. LYSTO has supported a number of research studies in lymphocyte assessment in oncology. LYSTO will be a long-lasting educational challenge for deep learning and digital pathology; it is available at https://lysto.grand-challenge.org/.
Submitted 13 April, 2023; v1 submitted 16 January, 2023;
originally announced January 2023.
-
Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
N. Aggarwal,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
M. Baricevic,
S. W. Barwick,
V. Basu,
R. Bay,
J. J. Beatty,
K. -H. Becker
, et al. (359 additional authors not shown)
Abstract:
IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challenge due to the irregular detector geometry, inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, it is possible to represent IceCube events as point cloud graphs and use a Graph Neural Network (GNN) as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range to the state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN offers a reduction of the FPR by over a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques in the energy range of 1-30 GeV. The GNN, when run on a GPU, is capable of processing IceCube events at a rate nearly double the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.
Submitted 11 October, 2022; v1 submitted 7 September, 2022;
originally announced September 2022.
-
Virtual reality (VR) as a testing bench for consumer optical solutions: A machine learning approach (GBR) to visual comfort under simulated progressive addition lenses (PALS) distortions
Authors:
Miguel García García,
Yannick Sauer,
Tamara Watson,
Siegfried Wahl
Abstract:
For decades, manufacturers have attempted to reduce or eliminate the optical aberrations that appear on the progressive addition lens' surfaces during manufacturing. Despite every effort made, some of these distortions are inevitable given how lenses are fabricated: astigmatism appears on the surface and cannot be entirely removed, and non-uniform magnification is inherent to the power change across the lens. Some presbyopes may report some discomfort when wearing these lenses for the first time, and a subset of them might never adapt. Developing, prototyping, testing and purveying those lenses into the market come at a cost, which is usually reflected in the retail price. This study aims to test the feasibility of virtual reality for testing customers' satisfaction with these lenses, even before putting them into production. VR offers a controlled environment where different parameters affecting progressive lens comfort, such as distortions, image displacement or optical blurring, can be analysed separately. In this study, the focus was set on the distortions and image displacement, not taking blur into account. Behavioural changes (head and eye movements) were recorded using the built-in eye tracker. Participants were significantly more displeased in the presence of highly distorted lens simulations. In addition, a gradient boosting regressor was fitted to the data, so predictors of discomfort could be unveiled, and ratings could be predicted without performing additional measurements.
Submitted 14 July, 2022;
originally announced July 2022.
-
Erdős-Selfridge Theorem for Nonmonotone CNFs
Authors:
Md Lutfar Rahman,
Thomas Watson
Abstract:
In an influential paper, Erdős and Selfridge introduced the Maker-Breaker game played on a hypergraph, or equivalently, on a monotone CNF. The players take turns assigning values to variables of their choosing, and Breaker's goal is to satisfy the CNF, while Maker's goal is to falsify it. The Erdős-Selfridge Theorem says that the least number of clauses in any monotone CNF with $k$ literals per clause where Maker has a winning strategy is $Θ(2^k)$.
We study the analogous question when the CNF is not necessarily monotone. We prove bounds of $Θ(\sqrt{2}\,^k)$ when Maker plays last, and $Ω(1.5^k)$ and $O(r^k)$ when Breaker plays last, where $r=(1+\sqrt{5})/2\approx 1.618$ is the golden ratio.
Submitted 3 January, 2022;
originally announced January 2022.
-
Learning to Control DC Motor for Micromobility in Real Time with Reinforcement Learning
Authors:
Bibek Poudel,
Thomas Watson,
Weizi Li
Abstract:
Autonomous micromobility has been attracting the attention of researchers and practitioners in recent years. A key component of many micro-transport vehicles is the DC motor, a complex dynamical system that is continuous and non-linear. Learning to quickly control the DC motor in the presence of disturbances and uncertainties is desired for various applications that require robustness and stability. Techniques to accomplish this task usually rely on a mathematical system model, which is often insufficient to anticipate the effects of time-varying and interrelated sources of non-linearities. While some model-free approaches have been successful at the task, they rely on massive interactions with the system and are trained in specialized hardware in order to fit a highly parameterized controller. In this work, we learn to steer a DC motor via sample-efficient reinforcement learning. Using data collected from hardware interactions in the real world, we additionally build a simulator to experiment with a wide range of parameters and learning strategies. With the best parameters found, we learn an effective control policy in one minute and 53 seconds on a simulation and in 10 minutes and 35 seconds on a physical system.
Submitted 30 July, 2022; v1 submitted 30 July, 2021;
originally announced August 2021.
-
A Convolutional Neural Network based Cascade Reconstruction for the IceCube Neutrino Observatory
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
C. Alispach,
A. A. Alves Jr.,
N. M. Amin,
R. An,
K. Andeen,
T. Anderson,
I. Ansseau,
G. Anton,
C. Argüelles,
S. Axani,
X. Bai,
A. Balagopal V.,
A. Barbano,
S. W. Barwick,
B. Bastian,
V. Basu,
V. Baum,
S. Baur,
R. Bay
, et al. (343 additional authors not shown)
Abstract:
Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful and fast reconstruction methods are desired. Deep neural networks can be extremely powerful, and their usage is computationally inexpensive once the networks are trained. These characteristics make a deep learning-based approach an excellent candidate for the application in IceCube. A reconstruction method based on convolutional architectures and hexagonally shaped kernels is presented. The presented method is robust towards systematic uncertainties in the simulation and has been tested on experimental data. In comparison to standard reconstruction methods in IceCube, it can improve upon the reconstruction accuracy, while reducing the time necessary to run the reconstruction by two to three orders of magnitude.
Submitted 26 July, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
When Is Amplification Necessary for Composition in Randomized Query Complexity?
Authors:
Shalev Ben-David,
Mika Göös,
Robin Kothari,
Thomas Watson
Abstract:
Suppose we have randomized decision trees for an outer function $f$ and an inner function $g$. The natural approach for obtaining a randomized decision tree for the composed function $(f\circ g^n)(x^1,\ldots,x^n)=f(g(x^1),\ldots,g(x^n))$ involves amplifying the success probability of the decision tree for $g$, so that a union bound can be used to bound the error probability over all the coordinates. The amplification introduces a logarithmic factor cost overhead. We study the question: When is this log factor necessary? We show that when the outer function is parity or majority, the log factor can be necessary, even for models that are more powerful than plain randomized decision trees. Our results are related to, but qualitatively strengthen in various ways, known results about decision trees with noisy inputs.
Submitted 19 June, 2020;
originally announced June 2020.
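As a rough illustration of the amplification step described in the abstract above (a sketch, not taken from the paper), the following Python snippet repeats a randomized evaluator for $g$ and takes a majority vote, choosing the number of repetitions so that a union bound over all $n$ coordinates keeps the total error below a target. The function names, the default error values of 1/3, and the use of Hoeffding's inequality to pick the repetition count are all assumptions made for illustration.

```python
import math
from collections import Counter


def amplify(evaluate_g, x, repetitions):
    """Run a randomized evaluator for g several times; return the majority answer."""
    votes = Counter(evaluate_g(x) for _ in range(repetitions))
    return votes.most_common(1)[0][0]


def repetitions_for_union_bound(n, base_error=1 / 3, total_error=1 / 3):
    """Repetitions so each of n inner evaluations errs with probability <= total_error / n.

    Hoeffding's inequality: the majority of t independent runs, each wrong with
    probability at most base_error < 1/2, is wrong with probability at most
    exp(-2 * t * (1/2 - base_error) ** 2).
    """
    gap = 0.5 - base_error
    per_coordinate_error = total_error / n
    return math.ceil(math.log(1 / per_coordinate_error) / (2 * gap ** 2))


def evaluate_composition(f, evaluate_g, inputs):
    """Evaluate f(g(x^1), ..., g(x^n)) using amplified evaluations of g."""
    t = repetitions_for_union_bound(len(inputs))
    return f([amplify(evaluate_g, x, t) for x in inputs])
```

The repetition count grows like $\log n$, which is the logarithmic overhead the abstract asks about; this sketch only shows where the factor comes from and says nothing about when it is necessary.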
-
Reward Shaping for Human Learning via Inverse Reinforcement Learning
Authors:
Mark A. Rucker,
Layne T. Watson,
Matthew S. Gerber,
Laura E. Barnes
Abstract:
Humans are spectacular reinforcement learners, constantly learning from and adjusting to experience and feedback. Unfortunately, this doesn't necessarily mean humans are fast learners. When tasks are challenging, learning can become unacceptably slow. Fortunately, humans do not have to learn tabula rasa, and learning speed can be greatly increased with learning aids. In this work we validate a new type of learning aid: reward shaping for humans via inverse reinforcement learning (IRL). The goal of this aid is to increase the speed with which humans can learn good policies for specific tasks. Furthermore, this approach complements alternative machine learning techniques such as safety features that try to prevent individuals from making poor decisions. To achieve our results we first extend a well-known IRL algorithm via kernel methods. Afterwards we conduct two human subjects experiments using an online game where players have limited time to learn a good policy. We show with statistical significance that players who receive our learning aid are able to approach desired policies more quickly than the control group.
Submitted 15 December, 2022; v1 submitted 25 February, 2020;
originally announced February 2020.
-
Query-to-Communication Lifting for BPP
Authors:
Mika Göös,
Toniann Pitassi,
Thomas Watson
Abstract:
For any $n$-bit boolean function $f$, we show that the randomized communication complexity of the composed function $f\circ g^n$, where $g$ is an index gadget, is characterized by the randomized decision tree complexity of $f$. In particular, this means that many query complexity separations involving randomized models (e.g., classical vs. quantum) automatically imply analogous separations in communication complexity.
Submitted 22 March, 2017;
originally announced March 2017.
-
Extension Complexity of Independent Set Polytopes
Authors:
Mika Göös,
Rahul Jain,
Thomas Watson
Abstract:
We exhibit an $n$-node graph whose independent set polytope requires extended formulations of size exponential in $Ω(n/\log n)$. Previously, no explicit examples of $n$-dimensional $0/1$-polytopes were known with extension complexity larger than exponential in $Θ(\sqrt{n})$. Our construction is inspired by a relatively little-known connection between extended formulations and (monotone) circuit depth.
Submitted 24 April, 2016;
originally announced April 2016.
-
Nonnegative Rank vs. Binary Rank
Authors:
Thomas Watson
Abstract:
Motivated by (and using tools from) communication complexity, we investigate the relationship between the following two ranks of a $0$-$1$ matrix: its nonnegative rank and its binary rank (the $\log$ of the latter being the unambiguous nondeterministic communication complexity). We prove that for partial $0$-$1$ matrices, there can be an exponential separation. For total $0$-$1$ matrices, we show that if the nonnegative rank is at most $3$ then the two ranks are equal, and we show a separation by exhibiting a matrix with nonnegative rank $4$ and binary rank $5$, as well as a family of matrices for which the binary rank is $4/3$ times the nonnegative rank.
Submitted 24 March, 2016;
originally announced March 2016.
-
Lift-and-Project Integrality Gaps for the Traveling Salesperson Problem
Authors:
Thomas Watson
Abstract:
We study the lift-and-project procedures of Lovász-Schrijver and Sherali-Adams applied to the standard linear programming relaxation of the traveling salesperson problem with triangle inequality. For the asymmetric TSP tour problem, Charikar, Goemans, and Karloff (FOCS 2004) proved that the integrality gap of the standard relaxation is at least 2. We prove that after one round of the Lovász-Schrijver or Sherali-Adams procedures, the integrality gap of the asymmetric TSP tour problem is at least 3/2, with a small caveat on which version of the standard relaxation is used. For the symmetric TSP tour problem, the integrality gap of the standard relaxation is known to be at least 4/3, and Cheung (SIOPT 2005) proved that it remains at least 4/3 after $o(n)$ rounds of the Lovász-Schrijver procedure, where $n$ is the number of nodes. For the symmetric TSP path problem, the integrality gap of the standard relaxation is known to be at least 3/2, and we prove that it remains at least 3/2 after $o(n)$ rounds of the Lovász-Schrijver procedure, by a simple reduction to Cheung's result.
Submitted 6 July, 2011;
originally announced July 2011.
-
Using Hierarchical Data Mining to Characterize Performance of Wireless System Configurations
Authors:
Alex Verstak,
Naren Ramakrishnan,
Kyung Kyoon Bae,
William H. Tranter,
Layne T. Watson,
Jian He,
Clifford A. Shaffer,
Theodore S. Rappaport
Abstract:
This paper presents a statistical framework for assessing wireless systems performance using hierarchical data mining techniques. We consider WCDMA (wideband code division multiple access) systems with two-branch STTD (space time transmit diversity) and 1/2 rate convolutional coding (forward error correction codes). Monte Carlo simulation estimates the bit error probability (BEP) of the system across a wide range of signal-to-noise ratios (SNRs). A performance database of simulation runs is collected over a targeted space of system configurations. This database is then mined to obtain regions of the configuration space that exhibit acceptable average performance. The shape of the mined regions illustrates the joint influence of configuration parameters on system performance. The role of data mining in this application is to provide explainable and statistically valid design conclusions. The research issue is to define statistically meaningful aggregation of data in a manner that permits efficient and effective data mining algorithms. We achieve a good compromise between these goals and help establish the applicability of data mining for characterizing wireless systems performance.
Submitted 25 August, 2002;
originally announced August 2002.
-
BSML: A Binding Schema Markup Language for Data Interchange in Problem Solving Environments (PSEs)
Authors:
Alex Verstak,
Naren Ramakrishnan,
Layne T. Watson,
Jian He,
Clifford A. Shaffer,
Kyung Kyoon Bae,
Jing Jiang,
William H. Tranter,
Theodore S. Rappaport
Abstract:
We describe a binding schema markup language (BSML) for describing data interchange between scientific codes. Such a facility is an important constituent of scientific problem solving environments (PSEs). BSML is designed to integrate with a PSE or application composition system that views model specification and execution as a problem of managing semistructured data. The data interchange problem is addressed by three techniques for processing semistructured data: validation, binding, and conversion. We present BSML and describe its application to a PSE for wireless communications system design.
Submitted 18 February, 2002;
originally announced February 2002.