Search | arXiv e-print repository

arXiv:2402.18288 [pdf, other]

Development of Context-Sensitive Formulas to Obtain Constant Luminance Perception for a Foreground Object in Front of Backgrounds of Varying Luminance

Authors: Ergun Akleman, Bekir Tevfik Akgun, Adil Alpkocak

Abstract: In this article, we present a framework for developing context-sensitive luminance correction formulas that can produce constant luminance perception for foreground objects. Our formulas make the foreground object slightly translucent to mix with the blurred version of the background. This mix can quickly produce any desired illusion of luminance in foreground objects based on the luminance of the background. The translucency formula has only one parameter: the relative size of the foreground object, which is a number between zero and one. We have identified the general structure of the translucency formulas as a power function of the relative size of the foreground object. We have implemented a web-based interactive program in Shadertoy. Using this program, we determined the coefficients of the polynomial exponents of the power function. To intuitively control the coefficients of the polynomial functions, we have used a Bézier form. Our final translucency formula uses a quadratic polynomial and requires only three coefficients. We also identified a simpler affine formula, which requires only two coefficients. We made our program publicly available in Shadertoy so that anyone can access and improve it. In this article, we also explain how to intuitively change the polynomial part of the formula. Using our explanation, users can change the polynomial part of the formula to obtain their own perceptually constant luminance. This can be used as a crowd-sourcing experiment for further improvement of the formula.

Submitted 28 February, 2024; originally announced February 2024.

Comments: 20 pages

arXiv:2402.10934 [pdf, other]

Projective Holder-Minkowski Colors: A Generalized Set of Commutative & Associative Operations with Inverse Elements for Representing and Manipulating Colors

Authors: Ergun Akleman, Somyung Oh, Youyou Wang, Bekir Tevfik Akgun, Jianer Chen

Abstract: One of the key problems in dealing with color in rendering, shading, compositing, or image manipulation is that we do not have algebraic structures that support operations over colors. In this paper, we present an all-encompassing framework that can support a set of algebraic structures with associativity, commutativity, and inverse properties. To provide these three properties, we build our algebraic structures on an extension of projective space by allowing for negative and complex numbers. These properties are important for (1) manipulating colors as periodic functions, (2) solving inverse problems dealing with colors, and (3) being consistent with the wave representation of the color. Allowance of negative and complex numbers is not a problem for practical applications, since we can always convert the results into the desired range for display purposes as we do in High Dynamic Range imaging. This set of algebraic structures can be considered as a generalization of the Minkowski norm Lp in projective space. These structures also provide a new version of the generalized Holder average with the associativity property. Our structures provide inverses of any operation by allowing for negative and complex numbers. These structures provide all properties of the generalized Holder average by providing a continuous bridge between the classical weighted average, harmonic mean, maximum, and minimum operations using a single parameter p.

Submitted 3 February, 2024; originally announced February 2024.

Comments: 37 pages

arXiv:2309.09756 [pdf, other]

Privileged to Predicted: Towards Sensorimotor Reinforcement Learning for Urban Driving

Authors: Ege Onat Özsüer, Barış Akgün, Fatma Güney

Abstract: Reinforcement Learning (RL) has the potential to surpass human performance in driving without needing any expert supervision. Despite its promise, the state-of-the-art in sensorimotor self-driving is dominated by imitation learning methods due to the inherent shortcomings of RL algorithms. Nonetheless, RL agents are able to discover highly successful policies when provided with privileged ground truth representations of the environment. In this work, we investigate what separates privileged RL agents from sensorimotor agents for urban driving in order to bridge the gap between the two. We propose vision-based deep learning models to approximate the privileged representations from sensor data. In particular, we identify aspects of state representation that are crucial for the success of the RL agent such as desired route generation and stop zone prediction, and propose solutions to gradually develop less privileged RL agents. We also observe that bird's-eye-view models trained on offline datasets do not generalize to online RL training due to distribution mismatch. Through rigorous evaluation on the CARLA simulation environment, we shed light on the significance of the state representations in RL for autonomous driving and point to unresolved challenges for future research.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 7 pages

arXiv:2308.04263 [pdf, other]

BarlowRL: Barlow Twins for Data-Efficient Reinforcement Learning

Authors: Omer Veysel Cagatan, Baris Akgun

Abstract: This paper introduces BarlowRL, a data-efficient reinforcement learning agent that combines the Barlow Twins self-supervised learning framework with the DER (Data-Efficient Rainbow) algorithm. BarlowRL outperforms both DER and its contrastive counterpart CURL on the Atari 100k benchmark. BarlowRL avoids dimensional collapse by enforcing information spread to the whole space. This helps RL algorithms to utilize a uniformly spread state representation, which eventually results in remarkable performance. The integration of Barlow Twins with DER enhances data efficiency and achieves superior performance in RL tasks. BarlowRL demonstrates the potential of incorporating self-supervised learning techniques to improve RL algorithms.

Submitted 12 October, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

Comments: ACML 2023, Camera-Ready Version

arXiv:2301.08184 [pdf, other]

Keyframe Demonstration Seeded and Bayesian Optimized Policy Search

Authors: Onur Berk Tore, Farzin Negahbani, Baris Akgun

Abstract: This paper introduces a novel Learning from Demonstration framework to learn robotic skills with keyframe demonstrations using a Dynamic Bayesian Network (DBN) and a Bayesian Optimized Policy Search approach to improve the learned skills. DBN learns the robot motion, perceptual change in the object of interest (aka skill sub-goals) and the relation between them. The rewards are also learned from the perceptual part of the DBN. The policy search part is a semi-black-box algorithm, which we call BO-PI2. It utilizes the action-perception relation to focus the high-level exploration, uses Gaussian Processes to model the expected return and performs Upper Confidence Bound type low-level exploration for sampling the rollouts. BO-PI2 is compared against a state-of-the-art method on three different skills in a real robot setting with expert and naive user demonstrations. The results show that our approach successfully focuses the exploration on the failed sub-goals and the addition of reward-predictive exploration outperforms the state-of-the-art approach on cumulative reward, skill success, and termination time metrics.

Submitted 19 January, 2023; originally announced January 2023.

arXiv:2212.07567 [pdf, other]

Learning Markerless Robot-Depth Camera Calibration and End-Effector Pose Estimation

Authors: Bugra C. Sefercik, Baris Akgun

Abstract: Traditional approaches to extrinsic calibration use fiducial markers and learning-based approaches rely heavily on simulation data. In this work, we present a learning-based markerless extrinsic calibration system that uses a depth camera and does not rely on simulation data. We learn models for end-effector (EE) segmentation, single-frame rotation prediction and keypoint detection, from automatically generated real-world data. We use a transformation trick to get EE pose estimates from rotation predictions and a matching algorithm to get EE pose estimates from keypoint predictions. We further utilize the iterative closest point algorithm, multiple frames, filtering and outlier detection to increase calibration robustness. Our evaluations with training data from multiple camera poses and test data from previously unseen poses give sub-centimeter and sub-deciradian average calibration and pose estimation errors. We also show that a carefully selected single training pose gives comparable results.

Submitted 14 December, 2022; originally announced December 2022.

Comments: 8 pages, 6 figures, Conference on Robot Learning

ACM Class: I.2.9

arXiv:2212.07179 [pdf, other]

doi 10.1016/j.iot.2022.100638

FLAGS Framework for Comparative Analysis of Federated Learning Algorithms

Authors: Ahnaf Hannan Lodhi, Barış Akgün, Öznur Özkasap

Abstract: Federated Learning (FL) has become a key choice for distributed machine learning. Initially focused on centralized aggregation, recent works in FL have emphasized greater decentralization to adapt to the highly heterogeneous network edge. Among these, Hierarchical, Device-to-Device and Gossip Federated Learning (HFL, D2DFL & GFL respectively) can be considered as foundational FL algorithms employing fundamental aggregation strategies. A number of FL algorithms were subsequently proposed employing multiple fundamental aggregation schemes jointly. Existing research, however, subjects the FL algorithms to varied conditions and gauges the performance of these algorithms mainly against Federated Averaging (FedAvg) only. This work consolidates the FL landscape and offers an objective analysis of the major FL algorithms through a comprehensive cross-evaluation for a wide range of operating conditions. In addition to the three foundational FL algorithms, this work also analyzes six derived algorithms. To enable a uniform assessment, a multi-FL framework named FLAGS: Federated Learning AlGorithms Simulation has been developed for rapid configuration of multiple FL algorithms. Our experiments indicate that fully decentralized FL algorithms achieve comparable accuracy under multiple operating conditions, including asynchronous aggregation and the presence of stragglers. Furthermore, decentralized FL can also operate in noisy environments and with a comparably higher local update rate. However, the impact of extremely skewed data distributions on decentralized FL is much more adverse than on centralized variants. The results indicate that it may not be necessary to restrict the devices to a single FL algorithm; rather, multi-FL nodes may operate with greater efficiency.

Submitted 14 December, 2022; originally announced December 2022.

Comments: 39 pages, 10 figures. Accepted for publication in Elsevier 'Internet of Things'

Journal ref: Internet of Things Volume 20, November 2022, 100638

arXiv:2111.04780 [pdf, other]

Frustum Fusion: Pseudo-LiDAR and LiDAR Fusion for 3D Detection

Authors: Farzin Negahbani, Onur Berk Töre, Fatma Güney, Baris Akgun

Abstract: Most autonomous vehicles are equipped with LiDAR sensors and stereo cameras. The former is very accurate but generates sparse data, whereas the latter is dense and has rich texture and color information, but it is difficult to extract robust 3D representations from. In this paper, we propose a novel data fusion algorithm to combine accurate point clouds with dense but less accurate point clouds obtained from stereo pairs. We develop a framework to integrate this algorithm into various 3D object detection methods. Our framework starts with 2D detections from both of the RGB images, calculates frustums and their intersection, creates Pseudo-LiDAR data from the stereo images, and fills in the parts of the intersection region where the LiDAR data is lacking with the dense Pseudo-LiDAR points. We train multiple 3D object detection methods and show that our fusion strategy consistently improves the performance of detectors.

Submitted 8 November, 2021; originally announced November 2021.

ACM Class: I.4.8

arXiv:2008.00824 [pdf, other]

State-of-the-art Techniques in Deep Edge Intelligence

Authors: Ahnaf Hannan Lodhi, Barış Akgün, Öznur Özkasap

Abstract: The potential held by the gargantuan volumes of data being generated across networks worldwide has been truly unlocked by machine learning techniques and more recently Deep Learning. The advantages offered by the latter have seen it rapidly becoming a framework of choice for various applications. However, the centralization of computational resources and the need for data aggregation have long been limiting factors in the democratization of Deep Learning applications. Edge Computing is an emerging paradigm that aims to utilize the hitherto untapped processing resources available at the network periphery. Edge Intelligence (EI) has quickly emerged as a powerful alternative to enable learning using the concepts of Edge Computing. Deep Learning-based Edge Intelligence or Deep Edge Intelligence (DEI) lies in this rapidly evolving domain. In this article, we provide an overview of the major constraints in operationalizing DEI. The major research avenues in DEI have been consolidated under Federated Learning, Distributed Computation, Compression Schemes and Conditional Computation. We also present some of the prevalent challenges and highlight prospective research avenues.

Submitted 24 December, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

Comments: 13 pages, 5 figures, 1 table

arXiv:1811.00876 [pdf, other]

Mind in the Machine: Perceived Minds Induce Decision Change

Authors: Deniz Lefkeli, Baris Akgun, Sahibzada Omar, Aansa Malik, Zeynep Gurhan Canli, Terry Eskenazi

Abstract: Recent research on human-robot interaction explored whether people's tendency to conform to others extends to artificial agents (Hertz & Wiese, 2016). However, little is known about the extent to which perception of a robot as having a mind affects people's decisions. Grounded on the theory of mind perception, the current study proposes that artificial agents can induce decision change to the extent to which individuals perceive them as having minds. By varying the degree to which robots expressed ability to act (agency) or feel (experience), we specifically investigated the underlying mechanisms of mind attribution to robots and social influence. Our results show an interactive effect of perceived experience and perceived agency on social influence induced by artificial agents. The findings provide preliminary insights regarding autonomous robots' influence on individuals' decisions and form a basis for understanding the underlying dynamics of decision making with robots.

Submitted 2 November, 2018; originally announced November 2018.

arXiv:1803.09689 [pdf, other]

Flow From Motion: A Deep Learning Approach

Authors: Cem Eteke, Hayati Havlucu, Nisa İrem Kırbaç, Mehmet Cengiz Onbaşlı, Aykut Coşkun, Terry Eskenazi, Oğuzhan Özcan, Barış Akgün

Abstract: Wearable devices have the potential to enhance sports performance, yet they are not fulfilling this promise. Our previous studies with 6 professional tennis coaches and 20 players indicate that this could be due to the lack of psychological or mental state feedback, which the coaches claim to provide. Towards this end, we propose to detect the flow state, the mental state of optimal performance, using wearables data to be later used in training. We performed a study with a professional tennis coach and two players. The coach provided labels about the players' flow state while each player had a wearable device on their racket-holding wrist. We trained multiple models using the wearables data and the coach labels. Our deep neural network models achieved around 98% testing accuracy for a variety of conditions. This suggests that the flow state, or what coaches recognize as flow, can be detected using wearables data in tennis, which is a novel result. The implication for the HCI community is that having access to such information would allow for design of novel hardware and interaction paradigms that would be helpful in professional athlete training.

Submitted 26 March, 2018; originally announced March 2018.

Comments: 7 pages, 2 figures, 2 tables

arXiv:1710.02796 [pdf, ps, other]

doi 10.1109/TIFS.2018.2876750

Vulnerabilities of Massive MIMO Systems Against Pilot Contamination Attacks

Authors: Berk Akgun, Marwan Krunz, O. Ozan Koyluoglu

Abstract: We consider a single-cell massive MIMO system in which a base station (BS) with a large number of antennas transmits simultaneously to several single-antenna users in the presence of an attacker. The BS acquires the channel state information (CSI) based on uplink pilot transmissions. In this work, we demonstrate the vulnerability of the CSI estimation phase to malicious attacks. For that purpose, we study two attack models. In the first model, the attacker aims at minimizing the sum-rate of downlink transmissions by contaminating the uplink pilots. In the second model, the attacker exploits its in-band full-duplex capabilities to generate jamming signals in both the CSI estimation and data transmission phases. We study these attacks under two downlink power allocation strategies when the attacker knows and does not know the locations of the BS and users. The formulated problems are solved using stochastic optimization, Lagrangian minimization, and game-theoretic methods. A closed-form solution for a special case of the problem is obtained. Furthermore, we analyze the achievable individual secrecy rates under a pilot contamination attack, and provide an upper bound on these rates. Our results indicate that the proposed attacks degrade the throughput of a massive MIMO system by more than half.

Submitted 8 October, 2017; originally announced October 2017.

arXiv:1604.01835 [pdf, ps, other]

doi 10.1109/TCOMM.2016.2641949

Exploiting Full-duplex Receivers for Achieving Secret Communications in Multiuser MISO Networks

Authors: Berk Akgun, O. Ozan Koyluoglu, Marwan Krunz

Abstract: We consider a broadcast channel, in which a multi-antenna transmitter (Alice) sends $K$ confidential information signals to $K$ legitimate users (Bobs) in the presence of $L$ eavesdroppers (Eves). Alice uses MIMO precoding to generate the information signals along with her own (Tx-based) friendly jamming. Interference at each Bob is removed by MIMO zero-forcing. This, however, leaves a "vulnerability region" around each Bob, which can be exploited by a nearby Eve. We address this problem by augmenting Tx-based friendly jamming (TxFJ) with Rx-based friendly jamming (RxFJ), generated by each Bob. Specifically, each Bob uses self-interference suppression (SIS) to transmit a friendly jamming signal while simultaneously receiving an information signal over the same channel. We minimize the powers allocated to the information, TxFJ, and RxFJ signals under given guarantees on the individual secrecy rate for each Bob. The problem is solved for the cases when the eavesdropper's channel state information is known/unknown. Simulations show the effectiveness of the proposed solution. Furthermore, we discuss how to schedule transmissions when the rate requirements need to be satisfied on average rather than instantaneously. Under special cases, a scheduling algorithm that serves only the strongest receivers is shown to outperform the one that schedules all receivers.

Submitted 9 January, 2017; v1 submitted 6 April, 2016; originally announced April 2016.

Comments: IEEE Transactions on Communications

Showing 1–13 of 13 results for author: Akgün, B