Quantifying Human Bias and Knowledge to guide ML models during Training
Authors:
Hrishikesh Viswanath,
Andrey Shor,
Yoshimasa Kitaguchi
Abstract:
This paper discusses a crowdsourcing-based method that we designed to quantify the importance of different attributes of a dataset in determining the outcome of a classification problem. This heuristic, provided by humans, acts as the initial weight seed for machine learning models and guides the model towards a better optimum during gradient descent. Real-world datasets are often skewed: they overrepresent items of certain classes while underrepresenting the rest. Skewed datasets may lead to unforeseen issues with models, such as learning a biased function or overfitting. Traditional data augmentation techniques in supervised learning include oversampling and training with synthetic data. We introduce an experimental approach to dealing with such unbalanced datasets by including humans in the training process. We ask humans to rank the importance of features of the dataset and, through rank aggregation, determine the initial weight bias for the model. We show that collective human bias can allow ML models to learn insights about the true population instead of the biased sample. In this paper, we use two rank aggregation methods, Kemeny-Young and the Markov Chain aggregator, to quantify human opinion on the importance of features. This work mainly tests the effectiveness of human knowledge on binary classification (Popular vs. Not-popular) problems with two ML models: Deep Neural Networks and Support Vector Machines. This approach considers humans as weak learners and relies on aggregation to offset individual biases and domain unfamiliarity.
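The rank aggregation step mentioned in the abstract can be illustrated with a minimal sketch of Kemeny-Young aggregation, which picks the ordering that minimizes the total number of pairwise disagreements (Kendall tau distance) with all submitted rankings. The feature names `A` to `D`, the three example rankings, and the helper `ranks_to_seed_weights` are hypothetical; in particular, the abstract does not say how a consensus ranking is converted into initial weights, so the linear mapping below is only an assumption. The brute-force search is practical only for a handful of features.

```python
from itertools import combinations, permutations


def kendall_tau_distance(a, b):
    """Count the item pairs that rankings a and b order differently."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def kemeny_young(rankings):
    """Brute-force Kemeny-Young consensus over complete rankings.

    Tries every permutation, so it is only feasible for small item sets.
    """
    items = rankings[0]
    return min(
        permutations(items),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )


def ranks_to_seed_weights(consensus):
    """Hypothetical mapping (an assumption, not from the paper):
    higher-ranked features get linearly larger, normalized weights."""
    n = len(consensus)
    raw = {feature: n - i for i, feature in enumerate(consensus)}
    total = sum(raw.values())
    return {feature: value / total for feature, value in raw.items()}


# Hypothetical rankings from three annotators, most important feature first.
rankings = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["B", "A", "C", "D"],
]
consensus = kemeny_young(rankings)
seed_weights = ranks_to_seed_weights(consensus)
print(consensus, seed_weights)
```

Individual annotators disagree on some pairs, but the consensus follows the majority preference on each pair, which is the sense in which aggregation can offset individual biases.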
Submitted 19 November, 2022;
originally announced November 2022.
Experimental Analysis of Communication Relaying Delay in Low-Energy Ad-hoc Networks
Authors:
Taichi Miya,
Kohta Ohshima,
Yoshiaki Kitaguchi,
Katsunori Yamaoka
Abstract:
In recent years, more and more applications use ad-hoc networks for local M2M communications, but in some cases, such as when using WSNs, the software processing delay induced by packet relaying may not be negligible. In this paper, we planned and carried out a delay measurement experiment using the Raspberry Pi Zero W. The results demonstrated that, in low-energy ad-hoc networks, the processing delay of the application is always too large to ignore; it is at least ten times greater than that of kernel routing and corresponds to 30% of the transmission delay. Furthermore, if the task is CPU-intensive, such as packet encryption, the processing delay can be greater than the transmission delay, and its behavior is represented by a simple linear model. Our findings indicate that the key factor for achieving QoS in ad-hoc networks is appropriate node-to-node load balancing that takes into account the CPU performance and the amount of traffic passing through each node.
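As a rough illustration of what fitting a "simple linear model" to processing delay could look like, the sketch below performs an ordinary least-squares fit of a line to delay measurements. The abstract does not state which variable the delay is linear in; payload size is assumed here purely for illustration, and the numbers are synthetic values generated from a known line, not results from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, illustrative data only: delay_ms = 0.5 + 0.01 * payload_bytes.
payload_bytes = [100, 200, 300, 400]
delay_ms = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_line(payload_bytes, delay_ms)
print(f"delay_ms ~ {slope:.4f} * payload_bytes + {intercept:.4f}")
```

With real measurements, the fitted slope would estimate the per-unit processing cost on a given node, which is the kind of quantity a CPU-aware load-balancing scheme would need.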
Submitted 10 December, 2020; v1 submitted 29 October, 2020;
originally announced October 2020.