Showing 1–50 of 228 results for author: Asano, Y

  1. Stable Tool-Use with Flexible Musculoskeletal Hands by Learning the Predictive Model of Sensor State Transition

    Authors: Kento Kawaharazuka, Kei Tsuzuki, Moritaka Onitsuka, Yuki Asano, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: The flexible under-actuated musculoskeletal hand is superior in its adaptability and impact resistance. On the other hand, since the relationship between sensors and actuators cannot be uniquely determined, almost all its controls are based on feedforward controls. When grasping and using a tool, the contact state of the hand gradually changes due to the inertia of the tool or impact of action, an…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted at ICRA2020

  2. Musculoskeletal AutoEncoder: A Unified Online Acquisition Method of Intersensory Networks for State Estimation, Control, and Simulation of Musculoskeletal Humanoids

    Authors: Kento Kawaharazuka, Kei Tsuzuki, Moritaka Onitsuka, Yuki Asano, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: While the musculoskeletal humanoid has various biomimetic benefits, the modeling of its complex structure is difficult, and many learning-based systems have been developed so far. There are various methods, such as control methods using acquired relationships between joints and muscles represented by a data table or neural network, and state estimation methods using Extended Kalman Filter or table…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted at IEEE Robotics and Automation Letters

  3. arXiv:2406.12658  [pdf, other]

    cs.CV cs.LG

    Federated Learning with a Single Shared Image

    Authors: Sunny Soni, Aaqib Saeed, Yuki M. Asano

    Abstract: Federated Learning (FL) enables multiple machines to collaboratively train a machine learning model without sharing of private training data. Yet, especially for heterogeneous models, a key bottleneck remains the transfer of knowledge gained from each client model with the server. One popular method, FedDF, uses distillation to tackle this task with the use of a common, shared dataset on which pre…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 8 Pages, 3 Figures, Appendix 4 Pages, CVPRW 2024

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 7782-7790

  4. Toward Autonomous Driving by Musculoskeletal Humanoids: A Study of Developed Hardware and Learning-Based Software

    Authors: Kento Kawaharazuka, Kei Tsuzuki, Yuya Koga, Yusuke Omura, Tasuku Makabe, Koki Shinjo, Moritaka Onitsuka, Yuya Nagamatsu, Yuki Asano, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: This paper summarizes an autonomous driving project by musculoskeletal humanoids. The musculoskeletal humanoid, which mimics the human body in detail, has redundant sensors and a flexible body structure. These characteristics are suitable for motions with complex environmental contact, and the robot is expected to sit down on the car seat, step on the acceleration and brake pedals, and operate the…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted at IEEE Robotics and Automation Magazine

  5. arXiv:2405.17423  [pdf, other]

    cs.CV cs.CL

    Privacy-Aware Visual Language Models

    Authors: Laurens Samson, Nimrod Barazani, Sennay Ghebreab, Yuki M. Asano

    Abstract: This paper aims to advance our understanding of how Visual Language Models (VLMs) handle privacy-sensitive information, a crucial concern as these technologies become integral to everyday life. To this end, we introduce a new benchmark PrivBench, which contains images from 8 sensitive categories such as passports, or fingerprints. We evaluate 10 state-of-the-art VLMs on this benchmark and observe…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: preprint

  6. arXiv:2405.14862  [pdf, other]

    cs.CL

    Bitune: Bidirectional Instruction-Tuning

    Authors: Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano

    Abstract: We introduce Bitune, a method that improves instruction-tuning of pretrained decoder-only large language models, leading to consistent gains on downstream tasks. Bitune applies both causal and bidirectional attention to the prompt, to obtain a better representation of the query or instruction. We realize this by introducing two sets of parameters, for which we apply parameter-efficient finetuning…

    Submitted 23 May, 2024; originally announced May 2024.

  7. arXiv:2405.11092  [pdf, other]

    cs.HC cs.RO

    What metrics of participation balance predict outcomes of collaborative learning with a robot?

    Authors: Yuya Asano, Diane Litman, Quentin King-Shepard, Tristan Maidment, Tyree Langley, Teresa Davison, Timothy Nokes-Malach, Adriana Kovashka, Erin Walker

    Abstract: One of the keys to the success of collaborative learning is balanced participation by all learners, but this does not always happen naturally. Pedagogical robots have the potential to facilitate balance. However, it remains unclear what participation balance robots should aim at; various metrics have been proposed, but it is still an open question whether we should balance human participation in h…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: To appear in Seventeenth International Conference on Educational Data Mining (EDM 2024)

  8. arXiv:2404.17202  [pdf, other]

    cs.CV

    Self-supervised visual learning in the low-data regime: a comparative evaluation

    Authors: Sotirios Konstantakos, Despina Ioanna Chalkiadaki, Ioannis Mademlis, Yuki M. Asano, Efstratios Gavves, Georgios Th. Papadopoulos

    Abstract: Self-Supervised Learning (SSL) is a valuable and robust training methodology for contemporary Deep Neural Networks (DNNs), enabling unsupervised pretraining on a `pretext task' that does not require ground-truth labels/annotation. This allows efficient representation learning from massive amounts of unlabeled training data, which in turn leads to increased accuracy in a `downstream task' by exploi…

    Submitted 26 April, 2024; originally announced April 2024.

  9. A Method of Joint Angle Estimation Using Only Relative Changes in Muscle Lengths for Tendon-driven Humanoids with Complex Musculoskeletal Structures

    Authors: Kento Kawaharazuka, Shogo Makino, Masaya Kawamura, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: Tendon-driven musculoskeletal humanoids typically have complex structures similar to those of human beings, such as ball joints and the scapula, in which encoders cannot be installed. Therefore, joint angles cannot be directly obtained and need to be estimated using the changes in muscle lengths. In previous studies, methods using table-search and extended Kalman filter have been developed. These…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted at Humanoids2018

  10. TWIMP: Two-Wheel Inverted Musculoskeletal Pendulum as a Learning Control Platform in the Real World with Environmental Physical Contact

    Authors: Kento Kawaharazuka, Tasuku Makabe, Shogo Makino, Kei Tsuzuki, Yuya Nagamatsu, Yuki Asano, Takuma Shirai, Fumihito Sugai, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: By the recent spread of machine learning in the robotics field, a humanoid that can act, perceive, and learn in the real world through contact with the environment needs to be developed. In this study, as one of the choices, we propose a novel humanoid TWIMP, which combines a human mimetic musculoskeletal upper limb with a two-wheel inverted pendulum. By combining the benefit of a musculoskeletal…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted at Humanoids2018

  11. arXiv:2404.14045  [pdf, ps, other]

    hep-th gr-qc hep-lat

    Defining the type IIB matrix model without breaking Lorentz symmetry

    Authors: Yuhma Asano, Jun Nishimura, Worapat Piensuk, Naoyuki Yamamori

    Abstract: The type IIB matrix model is a promising nonperturbative formulation of superstring theory, which may elucidate the emergence of (3+1)-dimensional space-time. However, the partition function is divergent due to the Lorentz symmetry, which is represented by a noncompact group. This divergence has been regularized conventionally by introducing some infrared cutoff, which breaks the Lorentz symmetry.…

    Submitted 10 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 4 pages, no figure, (v2) minor changes with improved presentation

    Report number: UTHEP-787, KEK-TH-2617

  12. arXiv:2404.13381  [pdf, other]

    cs.LG cs.CR cs.MA q-bio.PE

    DNA: Differentially private Neural Augmentation for contact tracing

    Authors: Rob Romijnders, Christos Louizos, Yuki M. Asano, Max Welling

    Abstract: The COVID19 pandemic had enormous economic and societal consequences. Contact tracing is an effective way to reduce infection rates by detecting potential virus carriers early. However, this was not generally adopted in the recent pandemic, and privacy concerns are cited as the most important reason. We substantially improve the privacy guarantees of the current state of the art in decentralized c…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Privacy Regulation and Protection in Machine Learning Workshop at ICLR 2024

  13. Online Learning of Joint-Muscle Mapping Using Vision in Tendon-driven Musculoskeletal Humanoids

    Authors: Kento Kawaharazuka, Shogo Makino, Masaya Kawamura, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: The body structures of tendon-driven musculoskeletal humanoids are complex, and accurate modeling is difficult, because they are made by imitating the body structures of human beings. For this reason, we have not been able to move them accurately like ordinary humanoids driven by actuators in each axis, and large internal muscle tension and slack of tendon wires have emerged by the model error bet…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted at IEEE Robotics and Automation Letters, 2018

  14. Long-time Self-body Image Acquisition and its Application to the Control of Musculoskeletal Structures

    Authors: Kento Kawaharazuka, Kei Tsuzuki, Shogo Makino, Moritaka Onitsuka, Yuki Asano, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: The tendon-driven musculoskeletal humanoid has many benefits that human beings have, but the modeling of its complex muscle and bone structures is difficult and conventional model-based controls cannot realize intended movements. Therefore, a learning control mechanism that acquires nonlinear relationships between joint angles, muscle tensions, and muscle lengths from the actual robot is necessary…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted at IEEE Robotics and Automation Letters, 2019

  15. Online Self-body Image Acquisition Considering Changes in Muscle Routes Caused by Softness of Body Tissue for Tendon-driven Musculoskeletal Humanoids

    Authors: Kento Kawaharazuka, Shogo Makino, Masaya Kawamura, Ayaka Fujii, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: Tendon-driven musculoskeletal humanoids have many benefits in terms of the flexible spine, multiple degrees of freedom, and variable stiffness. At the same time, because of its body complexity, there are problems in controllability. First, due to the large difference between the actual robot and its geometric model, it cannot move as intended and large internal muscle tension may emerge. Second, m…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted at IROS2018

  16. Development of Musculoskeletal Legs with Planar Interskeletal Structures to Realize Human Comparable Moving Function

    Authors: Moritaka Onitsuka, Manabu Nishiura, Kento Kawaharazuka, Kei Tsuzuki, Yasunori Toshimitsu, Yusuke Omura, Yuki Asano, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: Musculoskeletal humanoids have been developed by imitating humans and expected to perform natural and dynamic motions as well as humans. To achieve desired motions stably in current musculoskeletal humanoids is not easy because they cannot maintain the sufficient moment arm of muscles in various postures. In this research, we discuss planar structures that spread across joint structures such as li…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: accepted at Humanoids2020

  17. High-Power, Flexible, Robust Hand: Development of Musculoskeletal Hand Using Machined Springs and Realization of Self-Weight Supporting Motion with Humanoid

    Authors: Shogo Makino, Kento Kawaharazuka, Masaya Kawamura, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: Human can not only support their body during standing or walking, but also support them by hand, so that they can dangle a bar and others. But most humanoid robots support their body only in the foot and they use their hand just to manipulate objects because their hands are too weak to support their body. Strong hands are supposed to enable humanoid robots to act in much broader scene. Therefore,…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: accepted at IROS2017

  18. Five-fingered Hand with Wide Range of Thumb Using Combination of Machined Springs and Variable Stiffness Joints

    Authors: Shogo Makino, Kento Kawaharazuka, Ayaka Fujii, Masaya Kawamura, Tasuku Makabe, Moritaka Onitsuka, Yuki Asano, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: Human hands can not only grasp objects of various shape and size and manipulate them in hands but also exert such a large gripping force that they can support the body in the situations such as dangling a bar and climbing a ladder. On the other hand, it is difficult for most robot hands to manage both. Therefore in this paper we developed the hand which can grasp various objects and exert large gr…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: accepted at IROS2018

  19. arXiv:2403.01502  [pdf, ps, other]

    cond-mat.supr-con

    Oscillating-charged Andreev Bound States and Their Appearance in UTe$_2$

    Authors: Satoshi Ando, Shingo Kobayashi, Andreas P. Schnyder, Yasuhiro Asano, Satoshi Ikegaya

    Abstract: In a superconductor with a sublattice degree of freedom, we find unconventional Andreev bound states whose charge density oscillates in sign between the two sublattices. The appearance of these oscillating-charged Andreev bound states is characterized by a Zak phase, rather than a conventional topological invariant. In contrast to conventional Andreev bound states, for oscillating-charged Andreev…

    Submitted 3 March, 2024; originally announced March 2024.

  20. arXiv:2402.16844  [pdf, other]

    cs.LG cs.AI cs.CL

    Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding

    Authors: Benjamin Bergner, Andrii Skliar, Amelie Royer, Tijmen Blankevoort, Yuki Asano, Babak Ehteshami Bejnordi

    Abstract: Large language models (LLMs) have become ubiquitous in practice and are widely used for generation tasks such as translation, summarization and instruction following. However, their enormous size and reliance on autoregressive decoding increase deployment costs and complicate their use in latency-critical applications. In this work, we propose a hybrid approach that combines language models of dif…

    Submitted 26 February, 2024; originally announced February 2024.

  21. arXiv:2402.14957  [pdf, other]

    cs.CV cs.LG

    The Common Stability Mechanism behind most Self-Supervised Learning Approaches

    Authors: Abhishek Jha, Matthew B. Blaschko, Yuki M. Asano, Tinne Tuytelaars

    Abstract: Last couple of years have witnessed a tremendous progress in self-supervised learning (SSL), the success of which can be attributed to the introduction of useful inductive biases in the learning process to learn meaningful visual representations while avoiding collapse. These inductive biases and constraints manifest themselves in the form of different optimization formulations in the SSL techniqu…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Additional visualizations (.gif): https://github.com/abskjha/CenterVectorSSL

  22. arXiv:2402.08657  [pdf, other]

    cs.CV

    PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs

    Authors: Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek, Yuki M. Asano

    Abstract: Vision-Language Models (VLMs), such as Flamingo and GPT-4V, have shown immense potential by integrating large language models with vision systems. Nevertheless, these models face challenges in the fundamental computer vision task of object localisation, due to their training on multimodal data containing mostly captions without explicit spatial grounding. While it is possible to construct custom,…

    Submitted 13 February, 2024; originally announced February 2024.

  23. arXiv:2401.11485  [pdf, other]

    cs.CV cs.GR eess.IV

    ColorVideoVDP: A visual difference predictor for image, video and display distortions

    Authors: Rafal K. Mantiuk, Param Hanji, Maliha Ashraf, Yuta Asano, Alexandre Chapiro

    Abstract: ColorVideoVDP is a video and image quality metric that models spatial and temporal aspects of vision, for both luminance and color. The metric is built on novel psychophysical models of chromatic spatiotemporal contrast sensitivity and cross-channel contrast masking. It accounts for the viewing conditions, geometric, and photometric characteristics of the display. It was trained to predict common…

    Submitted 2 July, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: 28 pages

    Journal ref: SIGGRAPH 2024 Technical Papers, Article 129

  24. arXiv:2401.05735  [pdf, other]

    cs.CV cs.LG

    Object-Centric Diffusion for Efficient Video Editing

    Authors: Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian

    Abstract: Diffusion-based video editing have reached impressive quality and can transform either the global style, local structure, and attributes of given video inputs, following textual edit prompts. However, such solutions typically incur heavy memory and computational costs to generate temporally-coherent frames, either in the form of diffusion inversion and/or cross-frame attention. In this paper, we c…

    Submitted 11 January, 2024; originally announced January 2024.

  25. arXiv:2312.17244  [pdf, other]

    cs.LG cs.CL

    The LLM Surgeon

    Authors: Tycho F. A. van der Ouderaa, Markus Nagel, Mart van Baalen, Yuki M. Asano, Tijmen Blankevoort

    Abstract: State-of-the-art language models are becoming increasingly large in an effort to achieve the highest performance on large corpora of available textual data. However, the sheer size of the Transformer architectures makes it difficult to deploy models within computational, environmental or device-specific constraints. We explore data-driven compression of existing pretrained models as an alternative…

    Submitted 20 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  26. arXiv:2312.11581  [pdf, other]

    cs.CR cs.AI cs.LG

    Protect Your Score: Contact Tracing With Differential Privacy Guarantees

    Authors: Rob Romijnders, Christos Louizos, Yuki M. Asano, Max Welling

    Abstract: The pandemic in 2020 and 2021 had enormous economic and societal consequences, and studies show that contact tracing algorithms can be key in the early containment of the virus. While large strides have been made towards more effective contact tracing algorithms, we argue that privacy concerns currently hold deployment back. The essence of a contact tracing algorithm constitutes the communication…

    Submitted 15 February, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted to The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  27. arXiv:2312.08895  [pdf, other]

    cs.CV

    Motion Flow Matching for Human Motion Synthesis and Editing

    Authors: Vincent Tao Hu, Wenzhe Yin, Pingchuan Ma, Yunlu Chen, Basura Fernando, Yuki M Asano, Efstratios Gavves, Pascal Mettes, Bjorn Ommer, Cees G. M. Snoek

    Abstract: Human motion synthesis is a fundamental task in computer animation. Recent methods based on diffusion models or GPT structure demonstrate commendable performance but exhibit drawbacks in terms of slow sampling speeds and error accumulation. In this paper, we propose \emph{Motion Flow Matching}, a novel generative model designed for human motion generation featuring efficient sampling and effective…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: WIP

  28. arXiv:2312.08892  [pdf, other]

    cs.CV

    VaLID: Variable-Length Input Diffusion for Novel View Synthesis

    Authors: Shijie Li, Farhad G. Zanjani, Haitam Ben Yahia, Yuki M. Asano, Juergen Gall, Amirhossein Habibian

    Abstract: Novel View Synthesis (NVS), which tries to produce a realistic image at the target view given source view images and their corresponding poses, is a fundamental problem in 3D Vision. As this task is heavily under-constrained, some recent work, like Zero123, tries to solve this problem with generative modeling, specifically using pre-trained diffusion models. Although this strategy generalizes well…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: paper and supplementary material

  29. arXiv:2312.08825  [pdf, other]

    cs.CV

    Guided Diffusion from Self-Supervised Diffusion Features

    Authors: Vincent Tao Hu, Yunlu Chen, Mathilde Caron, Yuki M. Asano, Cees G. M. Snoek, Bjorn Ommer

    Abstract: Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or classifier pretraining. That is why guidance was harnessed from self-supervised learning backbones, like DINO. However, recent studies have revealed that the feature representation derived from diffusion model itself is discriminative for numerous downstream tasks a…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Work In Progress

  30. arXiv:2312.04539  [pdf, other]

    cs.CV

    Auto-Vocabulary Semantic Segmentation

    Authors: Osman Ülger, Maksymilian Kulicki, Yuki Asano, Martin R. Oswald

    Abstract: Open-ended image understanding tasks gained significant attention from the research community, particularly with the emergence of Vision-Language Models. Open-Vocabulary Segmentation (OVS) methods are capable of performing semantic segmentation without relying on a fixed vocabulary, and in some cases, they operate without the need for training or fine-tuning. However, OVS methods typically require…

    Submitted 20 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  31. arXiv:2311.17299  [pdf, other]

    cs.LG cs.CV cs.DC

    Federated Fine-Tuning of Foundation Models via Probabilistic Masking

    Authors: Vasileios Tsouvalas, Yuki Asano, Aaqib Saeed

    Abstract: Foundation Models (FMs) have revolutionized machine learning with their adaptability and high performance across tasks; yet, their integration into Federated Learning (FL) is challenging due to substantial communication overhead from their extensive parameterization. Current communication-efficient FL strategies, such as gradient compression, reduce bitrates to around $1$ bit-per-parameter (bpp).…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 19 pages, 9 figures

  32. arXiv:2310.11454  [pdf, other]

    cs.CL

    VeRA: Vector-based Random Matrix Adaptation

    Authors: Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano

    Abstract: Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when finetuning large language models, but still faces acute storage challenges when scaling to even larger models or deploying numerous per-user or per-task adapted models. In this work, we present Vector-based Random Matrix Adaptation (VeRA), which significantly reduces the number of trainable parameter…

    Submitted 16 January, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted at ICLR 2024, website: https://dkopi.github.io/vera

  33. arXiv:2310.08584  [pdf, other]

    cs.CV

    Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video

    Authors: Shashanka Venkataramanan, Mamshad Nayeem Rizve, João Carreira, Yuki M. Asano, Yannis Avrithis

    Abstract: Self-supervised learning has unlocked the potential of scaling up pretraining to billions of images, since annotation is unnecessary. But are we making the best use of data? How more economical can we be? In this work, we attempt to answer this question by making two contributions. First, we investigate first-person videos and introduce a "Walking Tours" dataset. These videos are high-resolution,…

    Submitted 23 May, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted to ICLR 2024 (Best paper honorable mention). Project Page: https://shashankvkt.github.io/dora

  34. arXiv:2310.00500  [pdf, other]

    cs.CV

    Self-Supervised Open-Ended Classification with Small Visual Language Models

    Authors: Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring, Yuki M. Asano

    Abstract: We present Self-Context Adaptation (SeCAt), a self-supervised approach that unlocks few-shot abilities for open-ended classification with small visual language models. Our approach imitates image captions in a self-supervised way based on clustering a large pool of images followed by assigning semantically-unrelated names to clusters. By doing so, we construct a training signal consisting of inter…

    Submitted 6 December, 2023; v1 submitted 30 September, 2023; originally announced October 2023.

  35. arXiv:2308.11796  [pdf, other]

    cs.CV

    Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations

    Authors: Mohammadreza Salehi, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano

    Abstract: Spatially dense self-supervised learning is a rapidly growing problem domain with promising applications for unsupervised segmentation and pretraining for dense downstream tasks. Despite the abundance of temporal data in the form of videos, this information-rich source has been largely overlooked. Our paper aims to address this gap by proposing a novel approach that incorporates temporal consisten…

    Submitted 22 August, 2023; originally announced August 2023.

  36. arXiv:2308.07350  [pdf, other]

    cs.LG cs.AI

    Efficient Neural PDE-Solvers using Quantization Aware Training

    Authors: Winfried van den Dool, Tijmen Blankevoort, Max Welling, Yuki M. Asano

    Abstract: In the past years, the application of neural networks as an alternative to classical numerical methods to solve Partial Differential Equations has emerged as a potential paradigm shift in this century-old mathematical field. However, in terms of practical applicability, computational cost remains a substantial bottleneck. Classical approaches try to mitigate this challenge by limiting the spatial…

    Submitted 14 August, 2023; originally announced August 2023.

    Comments: Accepted at the ICCV 2023 Workshop on Resource Efficient Deep Learning for Computer Vision

  37. arXiv:2308.02211  [pdf, ps, other]

    cond-mat.supr-con

    Discontinuous Transition to Superconducting Phase

    Authors: Takumi Sato, Shingo Kobayashi, Yasuhiro Asano

    Abstract: We discuss the instability of uniform superconducting states that contain the pairing correlations belonging to the odd-frequency symmetry class. The instability originates from the paramagnetic response of odd-frequency Cooper pairs and is considerable at finite temperatures. As a result, the pair potential varies discontinuously at the transition temperature when the amplitude of the odd-frequen…

    Submitted 1 July, 2024; v1 submitted 4 August, 2023; originally announced August 2023.

    Comments: 16 pages, 2 figure

  38. arXiv:2307.08727  [pdf, other]

    cs.CV

    Learning to Count without Annotations

    Authors: Lukas Knobel, Tengda Han, Yuki M. Asano

    Abstract: While recent supervised methods for reference-based object counting continue to improve the performance on benchmark datasets, they have to rely on small datasets due to the cost associated with manually annotating dozens of objects in images. We propose UnCounTR, a model that can learn this task without requiring any manual annotations. To this end, we construct "Self-Collages", images with vario…

    Submitted 29 March, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted at CVPR'24. Code available at https://github.com/lukasknobel/SelfCollages

  39. arXiv:2306.13291  [pdf, ps, other]

    cond-mat.supr-con cond-mat.mes-hall

    Multi-locational Majorana Zero Modes

    Authors: Yutaro Nagae, Andreas P. Schnyder, Yukio Tanaka, Yasuhiro Asano, Satoshi Ikegaya

    Abstract: We show the appearance of an unconventional Majorana zero mode whose wave function splits into multiple parts located at different ends of different topological superconductors, hereinafter referred to as a multi-locational Majorana zero mode. Specifically, we discuss the multi-locational Majorana zero modes in a three-terminal Josephson junction consisting of topological superconductors, which fo…

    Submitted 11 March, 2024; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: 6+6 pages, 2+2 figures

  40. arXiv:2306.09643  [pdf, other]

    cs.LG cs.AI stat.ME

    BISCUIT: Causal Representation Learning from Binary Interactions

    Authors: Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, Efstratios Gavves

    Abstract: Identifying the causal variables of an environment and how to intervene on them is of core value in applications such as robotics and embodied AI. While an agent can commonly interact with the environment and may implicitly perturb the behavior of some of these causal variables, often the targets it affects remain unknown. In this paper, we show that causal variables can still be identified for ma…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Published in: Uncertainty in Artificial Intelligence (UAI 2023). Project page: https://phlippe.github.io/BISCUIT/

  41. arXiv:2306.07302  [pdf, other]

    cs.HC cs.AI cs.CL

    Impact of Experiencing Misrecognition by Teachable Agents on Learning and Rapport

    Authors: Yuya Asano, Diane Litman, Mingzhi Yu, Nikki Lobczowski, Timothy Nokes-Malach, Adriana Kovashka, Erin Walker

    Abstract: While speech-enabled teachable agents have some advantages over typing-based ones, they are vulnerable to errors stemming from misrecognition by automatic speech recognition (ASR). These errors may propagate, resulting in unexpected changes in the flow of conversation. We analyzed how such changes are linked with learning gains and learners' rapport with the agents. Our results show they are not r…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted to AIED 2023

  42. Fulde-Ferrell-Larkin-Ovchinnikov state in a superconducting thin film attached to a ferromagnetic cluster

    Authors: Shu-Ichiro Suzuki, Takumi Sato, Alexander A. Golubov, Yasuhiro Asano

    Abstract: We study theoretically the Fulde-Ferrell-Larkin-Ovchinnikov (FFLO) states appearing locally in a superconducting thin film with a small circular magnetic cluster. The pair potential, the pairing correlations, the free-energy density, and the quasiparticle density of states are calculated for several cluster sizes and the exchange potentials by solving the Eilenberger equation in two dimensions. Th…

    Submitted 27 July, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: 9 pages, 6 figures

    Journal ref: Phys. Rev. B 108, 064509 (2023)

  43. arXiv:2304.08102  [pdf, other

    cond-mat.supr-con cond-mat.mes-hall

    Supercurrent reversal in Zeeman-split Josephson junctions

    Authors: Shu-Ichiro Suzuki, Yasuhiro Asano, Alexander A. Golubov

    Abstract: We study theoretically the shape of the current-phase relation in a Josephson junction comprising the Zeeman-split superconductors (ZSs) and a normal metal (N). We show that at low temperatures the Josephson current in the ZS/N/ZS junctions exhibits an additional reversal in direction at a certain phase difference $\varphi_c \in (0, \pi)$. Calculating the spectral Josephson current, the band-splitti…

    Submitted 16 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: 8 pages, 10 figures

    Journal ref: Phys. Rev. B 108, 144505 (2023)

  44. arXiv:2304.00961  [pdf, other

    cs.CV

    Self-Ordering Point Clouds

    Authors: Pengwan Yang, Cees G. M. Snoek, Yuki M. Asano

    Abstract: In this paper we address the task of finding representative subsets of points in a 3D point cloud by means of a point-wise ordering. Only a few works have tried to address this challenging vision problem, all with the help of hard to obtain point and cloud labels. Different from these works, we introduce the task of point-wise ordering in 3D point clouds through self-supervision, which we call sel…

    Submitted 10 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

  45. arXiv:2303.01008  [pdf, other

    hep-lat hep-th

    The dynamics of zero modes in lattice gauge theory---difference between SU(2) and SU(3) in 4D

    Authors: Yuhma Asano, Jun Nishimura

    Abstract: The dynamics of zero modes in gauge theory is highly nontrivial due to its nonperturbative nature even in the case where the other modes can be treated perturbatively. One of the related issues concerns the possible instability of the trivial vacuum $A_\mu(x)=0$ due to the existence of nontrivial degenerate vacua known as "torons". Here we investigate this issue for the 4D SU(2) and SU(3) pure Yang-…

    Submitted 7 March, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: 17 pages, 30 figures; v2: references added and typos corrected

    Report number: UTHEP-778, KEK-TH-2500

  46. arXiv:2302.00353  [pdf, other

    cs.LG cs.CV

    Towards Label-Efficient Incremental Learning: A Survey

    Authors: Mert Kilickaya, Joost van de Weijer, Yuki M. Asano

    Abstract: The current dominant paradigm when building a machine learning model is to iterate over a dataset over and over until convergence. Such an approach is non-incremental, as it assumes access to all images of all categories at once. However, for many applications, non-incremental learning is unrealistic. To that end, researchers study incremental learning, where a learner is required to adapt to an i…

    Submitted 11 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

  47. arXiv:2301.02240  [pdf, other

    cs.CV

    Skip-Attention: Improving Vision Transformers by Paying Less Attention

    Authors: Shashanka Venkataramanan, Amir Ghodrati, Yuki M. Asano, Fatih Porikli, Amirhossein Habibian

    Abstract: This work aims to improve the efficiency of vision transformers (ViT). While ViTs use computationally expensive self-attention operations in every layer, we identify that these operations are highly correlated across layers -- a key redundancy that causes unnecessary computations. Based on this observation, we propose SkipAt, a method to reuse self-attention computation from preceding layers to ap…

    Submitted 17 January, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

  48. On the existence of the NS5-brane limit of the plane wave matrix model

    Authors: Yuhma Asano, Goro Ishiki, Takaki Matsumoto, Shinji Shimasaki, Hiromasa Watanabe

    Abstract: We consider a double scaling limit of the plane wave matrix model (PWMM), in which the gravity dual geometry of PWMM reduces to a class of spherical NS5-brane solutions. We identify the form of the scaling limit for the dual geometry of PWMM around a general vacuum and then translate the limit into the field theoretic language. We also show that the limit indeed exists at least in a certain planar…

    Submitted 7 May, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: 35 pages, 7 figures; v2: a reference added; v3: some explanations and expressions elaborated

    Report number: DIAS-STP-22-17, UTHEP-776, YITP-22-132

    Journal ref: PTEP 2023 (2023) 4, 043B01

  49. arXiv:2211.03085  [pdf, other

    cond-mat.supr-con cond-mat.mes-hall

    Nuclear spin relaxation rate of nonunitary Dirac and Weyl superconductors

    Authors: Koki Maeno, Yuki Kawaguchi, Yasuhiro Asano, Shingo Kobayashi

    Abstract: Nonunitary superconductivity has attracted renewed interest as a novel gapless phase of matter. In this study, we investigate the superconducting gap structure of nonunitary odd-parity chiral pairing states in a superconductor involving strong spin-orbit interactions. By applying a group theoretical classification of chiral states in terms of discrete rotation symmetry, we categorized all possible…

    Submitted 6 November, 2022; originally announced November 2022.

    Comments: 18 pages, 4 figures

  50. arXiv:2210.10820  [pdf, other

    cs.CV cs.CL cs.IR cs.LG

    VTC: Improving Video-Text Retrieval with User Comments

    Authors: Laura Hanu, James Thewlis, Yuki M. Asano, Christian Rupprecht

    Abstract: Multi-modal retrieval is an important problem for many applications, such as recommendation and search. Current benchmarks and even datasets are often manually constructed and consist of mostly clean samples where all modalities are well-correlated with the content. Thus, current video-text retrieval literature largely focuses on video titles or audio transcripts, while ignoring user comments, sin…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted paper at the European Conference on Computer Vision (ECCV) 2022