-
Cross-Cultural Validation of Partner Models for Voice User Interfaces
Authors:
Katie Seaborn,
Iona Gessinger,
Suzuka Yoshida,
Benjamin R. Cowan,
Philip R. Doyle
Abstract:
Recent research has begun to assess people's perceptions of voice user interfaces (VUIs) as dialogue partners, termed partner models. Current self-report measures are only available in English, limiting research to English-speaking users. To improve the diversity of user samples and contexts that inform partner modelling research, we translated, localized, and evaluated the Partner Modelling Questionnaire (PMQ) for non-English speaking Western (German, n=185) and East Asian (Japanese, n=198) cohorts where VUI use is popular. Through confirmatory factor analysis (CFA), we find that the scale produces equivalent levels of goodness-of-fit for both our German and Japanese translations, confirming its cross-cultural validity. Still, the structure of the communicative flexibility factor did not replicate directly across Western and East Asian cohorts. We discuss how our translations can open up critical research on cultural similarities and differences in partner model use and design, whilst highlighting the challenges for ensuring accurate translation across cultural contexts.
Submitted 14 May, 2024;
originally announced May 2024.
-
Deceptive, Disruptive, No Big Deal: Japanese People React to Simulated Dark Commercial Patterns
Authors:
Katie Seaborn,
Tatsuya Itagaki,
Mizuki Watanabe,
Yijia Wang,
** Geng,
Takao Fujii,
Yuto Mandai,
Miu Kojima,
Suzuka Yoshida
Abstract:
Dark patterns and deceptive designs (DPs) are user interface elements that trick people into taking actions that benefit the purveyor. Such designs are widely deployed, with special varieties found in certain nations like Japan that can be traced to global power hierarchies and the local socio-linguistic context of use. In this breaking work, we report on the first user study involving Japanese people (n=30) experiencing a mock shopping website injected with simulated DPs. We found that Alphabet Soup and Misleading Reference Pricing were the most deceptive and least noticeable. Social Proofs, Sneaking in Items, and Untranslation were the least deceptive but Untranslation prevented most from cancelling their account. Mood significantly worsened after experiencing the website. We contribute the first empirical findings on a Japanese consumer base alongside a scalable approach to evaluating user attitudes, perceptions, and behaviours towards DPs in an interactive context. We urge for more human participant research and ideally collaborations with industry to assess real designs in the wild.
Submitted 13 May, 2024;
originally announced May 2024.
-
Robot Swarm Control Based on Smoothed Particle Hydrodynamics for Obstacle-Unaware Navigation
Authors:
Michikuni Eguchi,
Mai Nishimura,
Shigeo Yoshida,
Takefumi Hiraki
Abstract:
Robot swarms hold immense potential for performing complex tasks far beyond the capabilities of individual robots. However, the challenge in unleashing this potential is the robots' limited sensory capabilities, which hinder their ability to detect and adapt to unknown obstacles in real-time. To overcome this limitation, we introduce a novel robot swarm control method with an indirect obstacle detector using a smoothed particle hydrodynamics (SPH) model. The indirect obstacle detector can predict the collision with an obstacle and its collision point solely from the robot's velocity information. This approach enables the swarm to effectively and accurately navigate environments without the need for explicit obstacle detection, significantly enhancing their operational robustness and efficiency. Our method's superiority is quantitatively validated through a comparative analysis, showcasing its significant navigation and pattern formation improvements under obstacle-unaware conditions.
Submitted 24 April, 2024;
originally announced April 2024.
-
Swarm Body: Embodied Swarm Robots
Authors:
Sosuke Ichihashi,
So Kuroki,
Mai Nishimura,
Kazumi Kasaura,
Takefumi Hiraki,
Kazutoshi Tanaka,
Shigeo Yoshida
Abstract:
The human brain's plasticity allows for the integration of artificial body parts into the human body. Leveraging this, embodied systems realize intuitive interactions with the environment. We introduce a novel concept: embodied swarm robots. Swarm robots constitute a collective of robots working in harmony to achieve a common objective, in our case, serving as functional body parts. Embodied swarm robots can dynamically alter their shape, density, and the correspondences between body parts and individual robots. We contribute an investigation of the influence on embodiment of swarm robot-specific factors derived from these characteristics, focusing on a hand. Our paper is the first to examine these factors through virtual reality (VR) and real-world robot studies to provide essential design considerations and applications of embodied swarm robots. Through quantitative and qualitative analysis, we identified a system configuration to achieve the embodiment of swarm robots.
Submitted 29 February, 2024; v1 submitted 24 February, 2024;
originally announced February 2024.
-
Perceptual Dimensions of Physical Properties of Handheld Objects Induced by Impedance Changes
Authors:
Takeru Hashimoto,
Shigeo Yoshida,
Takuji Narumi
Abstract:
Haptics in virtual reality is the emerging dimension after audiovisual experiences. Researchers designed several handheld VR controllers to simulate haptic experiences in virtual reality environments. Some of these devices, equipped to deliver active force, can dynamically alter the timing and intensity of force feedback, potentially offering a wide array of haptic sensations. Past research primarily used a single index to evaluate how users perceive physical property parameters, potentially limiting the assessment to the designer's intended scope and neglecting other potential perceptual experiences.
Therefore, this study evaluates not how much but how humans feel a physical property when stimuli are changed. We conducted interviews to investigate how people feel when a haptic device changes motion impedance. We used thematic analysis to abstract the results of the interviews and gain an understanding of how humans attribute force feedback to a phenomenon. We also generated a vocabulary from the themes obtained from the interviews and asked users to evaluate force feedback using the semantic differential method. A factor analysis was used to investigate how changing the basic elements of motion, such as inertia, viscosity, and stiffness of the motion system, affects haptic perception. As a result, we obtained four critical factors (size, viscosity, weight, and flexibility) and clarified the correspondence between these factors and the change of impedance.
Submitted 4 December, 2023;
originally announced December 2023.
-
High-speed full-color computer-generated holography using a digital micromirror device and fiber-coupled RGB laser diode
Authors:
Shuhei Yoshida
Abstract:
Computer-generated holography (CGH) can be used to display three-dimensional (3D) images and has a special feature that no other technology possesses: it can reconstruct arbitrary object wavefronts. In this study, we investigated a high-speed full-color reconstruction method for improving the realism of 3D images produced using CGH. The proposed method uses a digital micromirror device (DMD) with a high-speed switching capability as the hologram display device. It produces 3D video by time-division multiplexing using an optical system incorporating fiber-coupled laser diodes (LDs) operating in red, green, and blue wavelengths. The wavelength dispersion of the DMD is compensated for by superimposing plane waves on the hologram. Fourier transform optics are used to separate the object, conjugate, and zeroth-order light, thus eliminating the need for an extensive 4f system. The resources used in this research, such as the programs used for the hologram generation and the schematics of the LD driver, are available on GitHub.
Submitted 16 October, 2023;
originally announced October 2023.
-
Observation of high-energy neutrinos from the Galactic plane
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
S. W. Barwick,
V. Basu,
S. Baur,
R. Bay,
J. J. Beatty,
K. -H. Becker,
J. Becker Tjus
, et al. (364 additional authors not shown)
Abstract:
The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrino emission using machine learning techniques applied to ten years of data from the IceCube Neutrino Observatory. We identify neutrino emission from the Galactic plane at the 4.5$σ$ level of significance, by comparing diffuse emission models to a background-only hypothesis. The signal is consistent with modeled diffuse emission from the Galactic plane, but could also arise from a population of unresolved point sources.
Submitted 10 July, 2023;
originally announced July 2023.
-
Non-iterative optimization of pseudo-labeling thresholds for training object detection models from multiple datasets
Authors:
Yuki Tanaka,
Shuhei M. Yoshida,
Makoto Terao
Abstract:
We propose a non-iterative method to optimize pseudo-labeling thresholds for learning object detection from a collection of low-cost datasets, each of which is annotated for only a subset of all the object classes. A popular approach to this problem is first to train teacher models and then to use their confident predictions as pseudo ground-truth labels when training a student model. To obtain the best result, however, thresholds for prediction confidence must be adjusted. This process typically involves iterative search and repeated training of student models and is time-consuming. Therefore, we develop a method to optimize the thresholds without iterative optimization by maximizing the $F_β$-score on a validation dataset, which measures the quality of pseudo labels and can be measured without training a student model. We experimentally demonstrate that our proposed method achieves an mAP comparable to that of grid search on the COCO and VOC datasets.
Submitted 18 October, 2022;
originally announced October 2022.
-
Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
N. Aggarwal,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
M. Baricevic,
S. W. Barwick,
V. Basu,
R. Bay,
J. J. Beatty,
K. -H. Becker
, et al. (359 additional authors not shown)
Abstract:
IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challenge due to the irregular detector geometry, inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, it is possible to represent IceCube events as point cloud graphs and use a Graph Neural Network (GNN) as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range to the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN offers a reduction of the FPR by over a factor 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques in the energy range of 1-30 GeV. The GNN, when run on a GPU, is capable of processing IceCube events at a rate nearly double of the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low energy neutrinos in online searches for transient events.
Submitted 11 October, 2022; v1 submitted 7 September, 2022;
originally announced September 2022.
-
MagGlove: A Haptic Glove with Movable Magnetic Force for Manipulation Learning
Authors:
Mikiya Kusunoki,
Shogo Yoshida,
Haoran Xie
Abstract:
Recently, haptic gloves have been extensively explored for various practical applications, such as manipulation learning. Previous glove devices have different force-driven systems, such as shape memory alloys, servo motors and pneumatic actuators; however, these proposed devices may have difficulty in fast finger movement, easy reproduction, and safety issues. In this study, we propose MagGlove, a novel haptic glove with a movable magnet mechanism that has a linear motor, to solve these issues. The proposed MagGlove device is a compact system on the back of the wearer's hand with high responsiveness, ease of use, and good safety. The proposed device is adaptive with the modification of the magnitude of the current flowing through the coil. Based on our evaluation study, it is verified that the proposed device can achieve finger motion in the given tasks. Therefore, MagGlove can provide flexible support tailored to the wearers' learning levels in manipulation learning tasks.
Submitted 27 July, 2022;
originally announced July 2022.
-
Towards Interpretable and Reliable Reading Comprehension: A Pipeline Model with Unanswerability Prediction
Authors:
Kosuke Nishida,
Kyosuke Nishida,
Itsumi Saito,
Sen Yoshida
Abstract:
Multi-hop QA with annotated supporting facts, which is the task of reading comprehension (RC) considering the interpretability of the answer, has been extensively studied. In this study, we define an interpretable reading comprehension (IRC) model as a pipeline model with the capability of predicting unanswerable queries. The IRC model justifies the answer prediction by establishing consistency between the predicted supporting facts and the actual rationale for interpretability. The IRC model detects unanswerable questions, instead of outputting the answer forcibly based on the insufficient information, to ensure the reliability of the answer. We also propose an end-to-end training method for the pipeline RC model. To evaluate the interpretability and the reliability, we conducted the experiments considering unanswerability in a multi-hop question for a given passage. We show that our end-to-end trainable pipeline model outperformed a non-interpretable model on our modified HotpotQA dataset. Experimental results also show that the IRC model achieves comparable results to the previous non-interpretable models in spite of the trade-off between prediction performance and interpretability.
Submitted 18 November, 2021; v1 submitted 17 November, 2021;
originally announced November 2021.
-
Joint Multimedia Event Extraction from Video and Article
Authors:
Brian Chen,
Xudong Lin,
Christopher Thomas,
Manling Li,
Shoya Yoshida,
Lovish Chum,
Heng Ji,
Shih-Fu Chang
Abstract:
Visual and textual modalities contribute complementary information about events described in multimedia documents. Videos contain rich dynamics and detailed unfoldings of events, while text describes more high-level and abstract concepts. However, existing event extraction methods either do not handle video or solely target video while ignoring other modalities. In contrast, we propose the first approach to jointly extract events from video and text articles. We introduce the new task of Video MultiMedia Event Extraction (Video M2E2) and propose two novel components to build the first system towards this task. First, we propose the first self-supervised multimodal event coreference model that can determine coreference between video events and text events without any manually annotated pairs. Second, we introduce the first multimodal transformer which extracts structured event information jointly from both videos and text documents. We also construct and will publicly release a new benchmark of video-article pairs, consisting of 860 video-article pairs with extensive annotations for evaluating methods on this task. Our experimental results demonstrate the effectiveness of our proposed method on our new benchmark dataset. We achieve 6.0% and 5.8% absolute F-score gain on multimodal event coreference resolution and multimedia event extraction.
Submitted 26 September, 2021;
originally announced September 2021.
-
Task-adaptive Pre-training of Language Models with Word Embedding Regularization
Authors:
Kosuke Nishida,
Kyosuke Nishida,
Sen Yoshida
Abstract:
Pre-trained language models (PTLMs) acquire domain-independent linguistic knowledge through pre-training with massive textual resources. Additional pre-training is effective in adapting PTLMs to domains that are not well covered by the pre-training corpora. Here, we focus on the static word embeddings of PTLMs for domain adaptation to teach PTLMs domain-specific meanings of words. We propose a novel fine-tuning process: task-adaptive pre-training with word embedding regularization (TAPTER). TAPTER runs additional pre-training by making the static word embeddings of a PTLM close to the word embeddings obtained in the target domain with fastText. TAPTER requires no additional corpus except for the training data of the downstream task. We confirmed that TAPTER improves the performance of the standard fine-tuning and the task-adaptive pre-training on BioASQ (question answering in the biomedical domain) and on SQuAD (the Wikipedia domain) when their pre-training corpora were not dominated by in-domain data.
Submitted 17 September, 2021;
originally announced September 2021.
-
NaviChoker: Augmenting Pressure Sensation via Pneumatic Actuator
Authors:
Shogo Yoshida,
Haoran Xie,
Kazunori Miyata
Abstract:
Many technologies have been developed in recent years to present audiovisual information in new ways, but developing an information presentation interface to convey tactile information is still a challenge. We propose a tactile device using wearable technology that is an all-around pressure presentation system using pneumatic actuators. Specifically, we develop a system in which a choker equipped with a pneumatic actuator is worn around the neck, that applies pressure in any direction to indicate to the user the direction in which to walk and also when to start and stop walking. In this paper, we describe the construction of the device, evaluation experiments, our assessment of the prototype, and future plans for the device.
Submitted 10 September, 2021;
originally announced September 2021.
-
AugLimb: Compact Robotic Limb for Human Augmentation
Authors:
Zeyu Ding,
Shogo Yoshida,
Toby Chong,
Tsukasa Fukusato,
Takuma Torii,
Haoran Xie
Abstract:
This work proposes a compact robotic limb, AugLimb, that can augment our body functions and support daily activities. AugLimb adopts a double-layer scissor unit for its extendable mechanism, which can extend to 2.5 times the forearm length. The proposed device can be mounted on the user's upper arm and can transform into a compact state that does not obstruct the wearer. The proposed device is lightweight and places a low burden on the wearer. We developed a prototype of AugLimb to demonstrate the proposed mechanisms. We believe that the design methodology of AugLimb can facilitate human augmentation research for practical use. See http://www.jaist.ac.jp/~xie/auglimb.html
Submitted 31 August, 2021;
originally announced September 2021.
-
Lower-Bounded Proper Losses for Weakly Supervised Classification
Authors:
Shuhei M. Yoshida,
Takashi Takenouchi,
Masashi Sugiyama
Abstract:
This paper discusses the problem of weakly supervised classification, in which instances are given weak labels that are produced by some label-corruption process. The goal is to derive conditions under which loss functions for weak-label learning are proper and lower-bounded -- two essential requirements for the losses used in class-probability estimation. To this end, we derive a representation theorem for proper losses in supervised learning, which dualizes the Savage representation. We use this theorem to characterize proper weak-label losses and find a condition for them to be lower-bounded. From these theoretical findings, we derive a novel regularization scheme called generalized logit squeezing, which makes any proper weak-label loss bounded from below, without losing properness. Furthermore, we experimentally demonstrate the effectiveness of our proposed approach, as compared to improper or unbounded losses. The results highlight the importance of properness and lower-boundedness.
Submitted 11 June, 2021; v1 submitted 4 March, 2021;
originally announced March 2021.
-
BPActuators: Lightweight and Low-Cost Soft Actuators by Balloons and Plastics
Authors:
Qiukai Qi,
Shogo Yoshida,
Genki Kakihana,
Takuma Torii,
Van Anh Ho,
Haoran Xie
Abstract:
To increase the awareness and impact, soft robotics needs to go beyond the lab environment and should be readily accessible to those even with no robotic expertise. However, most prevailing manufacturing methodologies require either professional equipment or materials that are not usually available to common people, thereby constraining the accessibility of soft robotics. In this communication, we propose a lightweight and low-cost soft bending actuator, called BPActuator, that can be easily fabricated with plastics and balloons. We fabricated a range of actuators with various morphology for characterization in terms of deformation and load-bearing capacity, and demonstrated that they can bend up to 35 degrees and exert force at the tip around 0.070$\pm$0.015N, which is over 5 times higher than their average gravity. We further implemented a gripper with three fingers using the proposed actuators, and found that the gripper can realize human-like grasp of a range of daily objects. The gripper can lift objects at least 8 times heavier than its own weight. Furthermore, the BPActuator is cost effective and each costs about 0.22 USD. Given these advantages, the BPActuators are expected to significantly improve the accessibility of soft robotics to a wider group without robotic expertise.
Submitted 4 March, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
A Convolutional Neural Network based Cascade Reconstruction for the IceCube Neutrino Observatory
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
C. Alispach,
A. A. Alves Jr.,
N. M. Amin,
R. An,
K. Andeen,
T. Anderson,
I. Ansseau,
G. Anton,
C. Argüelles,
S. Axani,
X. Bai,
A. Balagopal V.,
A. Barbano,
S. W. Barwick,
B. Bastian,
V. Basu,
V. Baum,
S. Baur,
R. Bay
, et al. (343 additional authors not shown)
Abstract:
Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful and fast reconstruction methods are desired. Deep neural networks can be extremely powerful, and their usage is computationally inexpensive once the networks are trained. These characteristics make a deep learning-based approach an excellent candidate for the application in IceCube. A reconstruction method based on convolutional architectures and hexagonally shaped kernels is presented. The presented method is robust towards systematic uncertainties in the simulation and has been tested on experimental data. In comparison to standard reconstruction methods in IceCube, it can improve upon the reconstruction accuracy, while reducing the time necessary to run the reconstruction by two to three orders of magnitude.
Submitted 26 July, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
VisualMRC: Machine Reading Comprehension on Document Images
Authors:
Ryota Tanaka,
Kyosuke Nishida,
Sen Yoshida
Abstract:
Recent studies on machine reading comprehension have focused on text-level understanding but have not yet reached the level of human understanding of the visual layout and content of real-world documents. In this study, we introduce a new visual machine reading comprehension dataset, named VisualMRC, wherein given a question and a document image, a machine reads and comprehends texts in the image to answer the question in natural language. Compared with existing visual question answering (VQA) datasets that contain texts in images, VisualMRC focuses more on developing natural language understanding and generation abilities. It contains 30,000+ pairs of a question and an abstractive answer for 10,000+ document images sourced from multiple domains of webpages. We also introduce a new model that extends existing sequence-to-sequence models, pre-trained with large-scale text corpora, to take into account the visual layout and content of documents. Experiments with VisualMRC show that this model outperformed the base sequence-to-sequence models and a state-of-the-art VQA model. However, its performance is still below that of humans on most automatic evaluation metrics. The dataset will facilitate research aimed at connecting vision and language understanding.
Submitted 10 May, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
Coordinated Linear Precoding in Downlink Multicell MU-MISO OFDMA Networks
Authors:
Mirza Golam Kibria,
Hidekazu Murata,
Susumu Yoshida
Abstract:
This paper considers coordinated linear precoding in downlink multicell multiuser orthogonal frequency-division multiple access (OFDMA) network. A less-complex, fast and provably convergent algorithm that maximizes the weighted sum-rate with per base station (BS) transmit power constraint is formulated. We approximate the nonconvex weighted sum-rate maximization (WSRM) problem with a solvable convex form by means of sequential parametric convex approximation (SPCA) approach. The second order cone program (SOCP) formulations of the objective function and constraints of the optimization problem are derived through proper change of variables, first order linear approximation and hyperbolic constraints transformation, etc. The algorithm converges to the suboptimal solution taking fewer number of iterations in comparison to other known iterative WSRM algorithms. Finally, numerical results are presented to justify the effectiveness and superiority of the proposed algorithm.
Submitted 15 August, 2017;
originally announced August 2017.
-
Review: Modeling and Classical Controller Of Quad-rotor
Authors:
Tarek N. Dief,
Shigeo Yoshida
Abstract:
This paper presents an overview of the most effective ideas for the Quad-rotor project. The concept of modeling using different methods is presented. The modeling part presented the nonlinear model, and the concept of linearization using small disturbance theory. Parameter identifications part explained the most important parameters that affect the system stability and tried to get suitable solutions for these problems and identify some parameters experimentally. Data filtration, Kalman filter, Structure design, motor distribution, aerodynamic effect, analysis of shroud and its effect on the resultant thrust were explained. The control part incorporates different classical schemes such as PD and PID controllers to stabilize the Quad-rotor. Also, different ideas are presented to stabilize the quad rotor using PID controllers with some modification to get high maneuverability and better performance.
Submitted 15 July, 2017; v1 submitted 13 July, 2017;
originally announced July 2017.
-
Development of a Real-time Colorectal Tumor Classification System for Narrow-band Imaging zoom-videoendoscopy
Authors:
Tsubasa Hirakawa,
Toru Tamaki,
Bisser Raytchev,
Kazufumi Kaneda,
Tetsushi Koide,
Shigeto Yoshida,
Hiroshi Mieno,
Shinji Tanaka
Abstract:
Colorectal endoscopy is important for the early detection and treatment of colorectal cancer and is used worldwide. A computer-aided diagnosis (CAD) system that provides an objective measure to endoscopists during colorectal endoscopic examinations would be of great value. In this study, we describe a newly developed CAD system that provides real-time objective measures. Our system captures the video stream from an endoscopic system and transfers it to a desktop computer. The captured video stream is then classified by a pretrained classifier and the results are displayed on a monitor. The experimental results show that our developed system works efficiently in actual endoscopic examinations and is medically significant.
Submitted 20 December, 2016; v1 submitted 15 December, 2016;
originally announced December 2016.
-
Domain Adaptation with L2 constraints for classifying images from different endoscope systems
Authors:
Toru Tamaki,
Shoji Sonoyama,
Takio Kurita,
Tsubasa Hirakawa,
Bisser Raytchev,
Kazufumi Kaneda,
Tetsushi Koide,
Shigeto Yoshida,
Hiroshi Mieno,
Shinji Tanaka,
Kazuaki Chayama
Abstract:
This paper proposes a method for domain adaptation that extends the maximum margin domain transfer (MMDT) proposed by Hoffman et al., by introducing L2 distance constraints between samples of different domains; thus, our method is denoted as MMDTL2. Motivated by the differences between the images taken by narrow band imaging (NBI) endoscopic devices, we utilize different NBI devices as different domains and estimate the transformations between samples of different domains, i.e., image samples taken by different NBI endoscope systems. We first formulate the problem in the primal form, and then derive the dual form with much lesser computational costs as compared to the naive approach. From our experimental results using NBI image datasets from two different NBI endoscopic devices, we find that MMDTL2 is better than MMDT and also support vector machines without adaptation, especially when NBI image features are high-dimensional and the per-class training samples are greater than 20.
Submitted 2 February, 2018; v1 submitted 8 November, 2016;
originally announced November 2016.
-
Transfer Learning for Endoscopic Image Classification
Authors:
Shoji Sonoyama,
Toru Tamaki,
Tsubasa Hirakawa,
Bisser Raytchev,
Kazufumi Kaneda,
Tetsushi Koide,
Shigeto Yoshida,
Hiroshi Mieno,
Shinji Tanaka
Abstract:
In this paper we propose a method for transfer learning of endoscopic images. For transferring between features obtained from images taken by different (old and new) endoscopes, we extend the Max-Margin Domain Transfer (MMDT) proposed by Hoffman et al. in order to use L2 distance constraints as regularization, called Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2). Furthermore, we develop the dual formulation of the optimization problem in order to reduce the computation cost. Experimental results demonstrate that the proposed MMDTL2 outperforms MMDT for real data sets taken by different endoscopes.
Submitted 24 August, 2016;
originally announced August 2016.
-
Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features
Authors:
Toru Tamaki,
Shoji Sonoyama,
Tsubasa Hirakawa,
Bisser Raytchev,
Kazufumi Kaneda,
Tetsushi Koide,
Shigeto Yoshida,
Hiroshi Mieno,
Shinji Tanaka
Abstract:
In this paper we report results for recognizing colorectal NBI endoscopic images by using features extracted from convolutional neural network (CNN). In this comparative study, we extract features from different layers from different CNN models, and then train linear SVM classifiers. Experimental results with 10-fold cross validations show that features from first few convolution layers are enough to achieve similar performance (i.e., recognition rate of 95%) with non-CNN local features such as Bag-of-Visual words, Fisher vector, and VLAD.
Submitted 24 August, 2016;
originally announced August 2016.
-
Interactive Restless Multi-armed Bandit Game and Swarm Intelligence Effect
Authors:
Shunsuke Yoshida,
Masato Hisakado,
Shintaro Mori
Abstract:
We obtain the conditions for the emergence of the swarm intelligence effect in an interactive game of restless multi-armed bandit (rMAB). A player competes with multiple agents. Each bandit has a payoff that changes with a probability $p_{c}$ per round. The agents and player choose one of three options: (1) Exploit (a good bandit), (2) Innovate (asocial learning for a good bandit among $n_{I}$ randomly chosen bandits), and (3) Observe (social learning for a good bandit). Each agent has two parameters $(c,p_{obs})$ to specify the decision: (i) $c$, the threshold value for Exploit, and (ii) $p_{obs}$, the probability for Observe in learning. The parameters $(c,p_{obs})$ are uniformly distributed. We determine the optimal strategies for the player using complete knowledge about the rMAB. We show whether or not social or asocial learning is more optimal in the $(p_{c},n_{I})$ space and define the swarm intelligence effect. We conduct a laboratory experiment (67 subjects) and observe the swarm intelligence effect only if $(p_{c},n_{I})$ are chosen so that social learning is far more optimal than asocial learning.
Submitted 13 March, 2015;
originally announced March 2015.
-
The IceProd Framework: Distributed Data Processing for the IceCube Neutrino Observatory
Authors:
M. G. Aartsen,
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
D. Altmann,
C. Arguelles,
J. Auffenberg,
X. Bai,
M. Baker,
S. W. Barwick,
V. Baum,
R. Bay,
J. J. Beatty,
J. Becker Tjus,
K. -H. Becker,
S. BenZvi,
P. Berghaus,
D. Berley,
E. Bernardini,
A. Bernhard,
D. Z. Besson,
G. Binder,
D. Bindig,
et al. (262 additional authors not shown)
Abstract:
IceCube is a one-gigaton instrument located at the geographic South Pole, designed to detect cosmic neutrinos, identify the particle nature of dark matter, and study high-energy neutrinos themselves. Simulation of the IceCube detector and processing of data require a significant amount of computational resources. IceProd is a distributed management system based on Python, XML-RPC and GridFTP. It is driven by a central database in order to coordinate and administer production of simulations and processing of data produced by the IceCube detector. IceProd runs as a separate layer on top of other middleware and can take advantage of a variety of computing resources, including grids and batch systems such as CREAM, Condor, and PBS. This is accomplished by a set of dedicated daemons that process job submission in a coordinated fashion through the use of middleware plugins that serve to abstract the details of job submission and job management from the framework.
Submitted 22 August, 2014; v1 submitted 22 November, 2013;
originally announced November 2013.