Showing 1–50 of 53 results for author: Kasai, S

Searching in archive cs.
  1. Attack on Scene Flow using Point Clouds

    Authors: Haniyeh Ehsani Oskouie, Mohammad-Shahram Moin, Shohreh Kasaei

    Abstract: Deep neural networks have made significant advancements in accurately estimating scene flow using point clouds, which is vital for many applications like video analysis, action recognition, and navigation. The robustness of these techniques, however, remains a concern, particularly in the face of adversarial attacks that have been proven to deceive state-of-the-art deep neural networks in many dom…

    Submitted 17 June, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

  2. arXiv:2404.11335  [pdf, other]

    cs.CV cs.AI cs.LG

    SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap

    Authors: Vladimir Somers, Victor Joos, Anthony Cioppa, Silvio Giancola, Seyed Abolfazl Ghasemzadeh, Floriane Magera, Baptiste Standaert, Amir Mohammad Mansourian, Xin Zhou, Shohreh Kasaei, Bernard Ghanem, Alexandre Alahi, Marc Van Droogenbroeck, Christophe De Vleeschouwer

    Abstract: Tracking and identifying athletes on the pitch holds a central role in collecting essential insights from the game, such as estimating the total distance covered by players or understanding team tactics. This tracking and identification process is crucial for reconstructing the game state, defined by the athletes' positions and identities on a 2D top-view of the pitch, (i.e. a minimap). However, r…

    Submitted 17 April, 2024; originally announced April 2024.

  3. arXiv:2403.05451  [pdf, other]

    cs.CV

    Attention-guided Feature Distillation for Semantic Segmentation

    Authors: Amir M. Mansourian, Arya Jalali, Rozhan Ahmadi, Shohreh Kasaei

    Abstract: In contrast to existing complex methodologies commonly employed for distilling knowledge from a teacher to a student, the proposed method showcases the efficacy of a simple yet powerful method for utilizing refined feature maps to transfer attention. The proposed method has proven to be effective in distilling rich information, outperforming existing methods in semantic segmentation as a dense pr…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 17 pages, 8 figures, and 3 tables

  4. arXiv:2402.02474  [pdf, other]

    cs.CV

    Deep Spectral Improvement for Unsupervised Image Instance Segmentation

    Authors: Farnoosh Arefi, Amir M. Mansourian, Shohreh Kasaei

    Abstract: Deep spectral methods reframe the image decomposition process as a graph partitioning task by extracting features using self-supervised learning and utilizing the Laplacian of the affinity matrix to obtain eigensegments. However, instance segmentation has received less attention compared to other tasks within the context of deep spectral methods. This paper addresses the fact that not all channels…

    Submitted 6 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: 11 pages, 13 figures and 5 tables

  5. arXiv:2401.17828  [pdf, other]

    cs.CV cs.AI

    Leveraging Swin Transformer for Local-to-Global Weakly Supervised Semantic Segmentation

    Authors: Rozhan Ahmadi, Shohreh Kasaei

    Abstract: In recent years, weakly supervised semantic segmentation using image-level labels as supervision has received significant attention in the field of computer vision. Most existing methods have addressed the challenges arising from the lack of spatial information in these labels by focusing on facilitating supervised learning through the generation of pseudo-labels from class activation maps (CAMs).…

    Submitted 11 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: 7 pages, 4 figures, 3 tables

  6. Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking

    Authors: Amir M. Mansourian, Vladimir Somers, Christophe De Vleeschouwer, Shohreh Kasaei

    Abstract: Effective tracking and re-identification of players is essential for analyzing soccer videos. But, it is a challenging task due to the non-linear motion of players, the similarity in appearance of players from the same team, and frequent occlusions. Therefore, the ability to extract meaningful embeddings to represent players is crucial in developing an effective tracking and re-identification syst…

    Submitted 18 January, 2024; originally announced January 2024.

    Journal ref: Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports (MMSports 2023), October 29, 2023, Ottawa, ON, Canada

  7. arXiv:2401.00833  [pdf, other]

    cs.CV

    Rethinking RAFT for Efficient Optical Flow

    Authors: Navid Eslami, Farnoosh Arefi, Amir M. Mansourian, Shohreh Kasaei

    Abstract: Despite significant progress in deep learning-based optical flow methods, accurately estimating large displacements and repetitive patterns remains a challenge. The limitations of local features and similarity search patterns used in these algorithms contribute to this issue. Additionally, some existing methods suffer from slow runtime and excessive graphic memory consumption. To address these pro…

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: 7 pages, 5 figures, 4 tables

    ACM Class: F.2.2; I.2.7

  8. arXiv:2401.00496  [pdf, other]

    cs.CV cs.AI cs.LG

    SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge

    Authors: Dimitrios Psychogyios, Emanuele Colleoni, Beatrice Van Amsterdam, Chih-Yang Li, Shu-Yu Huang, Yuchong Li, Fucang Jia, Baosheng Zou, Guotai Wang, Yang Liu, Maxence Boels, Jiayu Huo, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin, Mengya Xu, An Wang, Yanan Wu, Long Bai, Hongliang Ren, Atsushi Yamada, Yuriko Harai, Yuto Ishikawa, Kazuyuki Hayashi, et al. (25 additional authors not shown)

    Abstract: Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Nowadays, learning-based action recognition and segmentation approaches outperform classical methods, relying, however, on large, annotated datasets. Furthermore, action recognition and tool segme…

    Submitted 23 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

  9. Domain generalization across tumor types, laboratories, and species -- insights from the 2022 edition of the Mitosis Domain Generalization Challenge

    Authors: Marc Aubreville, Nikolas Stathonikos, Taryn A. Donovan, Robert Klopfleisch, Jonathan Ganz, Jonas Ammeling, Frauke Wilm, Mitko Veta, Samir Jabari, Markus Eckstein, Jonas Annuscheit, Christian Krumnow, Engin Bozaba, Sercan Cayir, Hongyan Gu, Xiang 'Anthony' Chen, Mostafa Jahanifar, Adam Shephard, Satoshi Kondo, Satoshi Kasai, Sujatha Kotte, VG Saipradeep, Maxime W. Lafarge, Viktor H. Koelzer, Ziyue Wang, et al. (5 additional authors not shown)

    Abstract: Recognition of mitotic figures in histologic tumor specimens is highly relevant to patient outcome assessment. This task is challenging for algorithms and human experts alike, with deterioration of algorithmic performance under shifts in image representations. Considerable covariate shifts occur when assessment is performed on different tumor types, images are acquired using different digitization…

    Submitted 31 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Journal ref: Medical Image Analysis Volume 94, May 2024, 103155

  10. arXiv:2308.04243  [pdf, other]

    cs.CV

    AICSD: Adaptive Inter-Class Similarity Distillation for Semantic Segmentation

    Authors: Amir M. Mansourian, Rozhan Ahmadi, Shohreh Kasaei

    Abstract: In recent years, deep neural networks have achieved remarkable accuracy in computer vision tasks. With inference time being a crucial factor, particularly in dense prediction tasks such as semantic segmentation, knowledge distillation has emerged as a successful technique for improving the accuracy of lightweight student networks. The existing methods often neglect the information in channels and…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 10 pages, 5 figures, 5 tables

  11. The ACROBAT 2022 Challenge: Automatic Registration Of Breast Cancer Tissue

    Authors: Philippe Weitz, Masi Valkonen, Leslie Solorzano, Circe Carr, Kimmo Kartasalo, Constance Boissin, Sonja Koivukoski, Aino Kuusela, Dusan Rasic, Yanbo Feng, Sandra Sinius Pouplier, Abhinav Sharma, Kajsa Ledesma Eriksson, Stephanie Robertson, Christian Marzahl, Chandler D. Gatenbee, Alexander R. A. Anderson, Marek Wodzinski, Artur Jurgas, Niccolò Marini, Manfredo Atzori, Henning Müller, Daniel Budelmann, Nick Weiss, Stefan Heldmann, et al. (16 additional authors not shown)

    Abstract: The alignment of tissue between histopathological whole-slide-images (WSI) is crucial for research and clinical applications. Advances in computing, deep learning, and availability of large WSI datasets have revolutionised WSI analysis. Therefore, the current state-of-the-art in WSI registration is unclear. To address this, we conducted the ACROBAT challenge, based on the largest WSI registration…

    Submitted 29 May, 2023; originally announced May 2023.

  12. arXiv:2305.07152  [pdf, other]

    cs.CV

    Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge

    Authors: Aneeq Zia, Kiran Bhattacharyya, Xi Liu, Max Berniker, Ziheng Wang, Rogerio Nespolo, Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa, Bo Liu, David Austin, Yiheng Wang, Michal Futrega, Jean-Francois Puget, Zhenqiang Li, Yoichi Sato, Ryo Fujii, Ryo Hachiuma, Mana Masuda, Hideo Saito, An Wang, Mengya Xu, Mobarakol Islam, Long Bai, Winnie Pang, et al. (46 additional authors not shown)

    Abstract: The ability to automatically detect and track surgical instruments in endoscopic videos can enable transformational interventions. Assessing surgical performance and efficiency, identifying skilled tool use and choreography, and planning operational and logistical aspects of OR resources are just a few of the applications that could benefit. Unfortunately, obtaining the annotations needed to train…

    Submitted 31 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

  13. arXiv:2303.06274  [pdf]

    cs.CV cs.LG

    CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

    Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Martin Weigert, Uwe Schmidt, Wenhua Zhang, Jun Zhang, Sen Yang, Jinxi Xiang, Xiyue Wang, Josef Lorenz Rumberger, Elias Baumann, Peter Hirsch, Lihao Liu, Chenyang Hong, Angelica I. Aviles-Rivero, Ayushi Jain, Heeyoung Ahn, Yiyu Hong, Hussam Azzuni, Min Xu, Mohammad Yaqub, Marie-Claire Blache, Benoît Piégu, Bertrand Vernay, et al. (64 additional authors not shown)

    Abstract: Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of repro…

    Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

  14. arXiv:2302.06294  [pdf, other]

    eess.IV cs.CV cs.LG

    CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection

    Authors: Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi, Finn-Henri Smidt, Xiaoyang Zou, Guoyan Zheng, Bruno Oliveira, Helena R. Torres, Satoshi Kondo, Satoshi Kasai, Felix Holm, Ege Özsoy, Shuangchun Gui, Han Li, Sista Raviteja, Rachana Sathish, Pranav Poudel, Binod Bhattarai, et al. (24 additional authors not shown)

    Abstract: Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier effor…

    Submitted 14 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: MICCAI EndoVis CholecTriplet2022 challenge report. Published at Elsevier journal of Medical Image Analysis. 25 pages, 15 figures, 8 tables

    Journal ref: Medical Image Analysis, Volume 89, 2023, 102888, ISSN 1361-8415

  15. arXiv:2302.01738  [pdf, other]

    eess.IV cs.LG

    AIROGS: Artificial Intelligence for RObust Glaucoma Screening Challenge

    Authors: Coen de Vente, Koenraad A. Vermeer, Nicolas Jaccard, He Wang, Hongyi Sun, Firas Khader, Daniel Truhn, Temirgali Aimyshev, Yerkebulan Zhanibekuly, Tien-Dung Le, Adrian Galdran, Miguel Ángel González Ballester, Gustavo Carneiro, Devika R G, Hrishikesh P S, Densen Puthussery, Hong Liu, Zekang Yang, Satoshi Kondo, Satoshi Kasai, Edward Wang, Ashritha Durvasula, Jónathan Heras, Miguel Ángel Zapata, Teresa Araújo, et al. (11 additional authors not shown)

    Abstract: The early detection of glaucoma is essential in preventing visual impairment. Artificial intelligence (AI) can be used to analyze color fundus photographs (CFPs) in a cost-effective manner, making glaucoma screening more accessible. While AI models for glaucoma screening from CFPs have shown promising results in laboratory settings, their performance decreases significantly in real-world scenarios…

    Submitted 10 February, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: 19 pages, 8 figures, 3 tables

  16. arXiv:2301.10575  [pdf, other]

    cs.CV cs.LG eess.IV

    Trainable Loss Weights in Super-Resolution

    Authors: Arash Chaichi Mellatshahi, Shohreh Kasaei

    Abstract: In recent years, limited research has discussed the loss function in the super-resolution process. The majority of those studies have only used perceptual similarity conventionally. This is while the development of appropriate loss can improve the quality of other methods as well. In this article, a new weighting method for pixel-wise loss is proposed. With the help of this method, it is possible…

    Submitted 27 November, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: 9 pages, 6 figures, 2 table

    MSC Class: 68T07 (Primary) 68T45 (Secondary) ACM Class: I.4

  17. arXiv:2210.14164  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG

    No-Box Attacks on 3D Point Cloud Classification

    Authors: Hanieh Naderi, Chinthaka Dinesh, Ivan V. Bajic, Shohreh Kasaei

    Abstract: Adversarial attacks pose serious challenges for deep neural network (DNN)-based analysis of various input signals. In the case of 3D point clouds, methods have been developed to identify points that play a key role in network decision, and these become crucial in generating existing adversarial attacks. For example, a saliency map approach is a popular method for identifying adversarial drop point…

    Submitted 27 January, 2024; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: 10 pages, 6 figures

  18. arXiv:2208.12635  [pdf]

    eess.IV cs.CV

    A Two Step Approach for Whole Slide Image Registration

    Authors: Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa

    Abstract: Multi-stain whole-slide-image (WSI) registration is an active field of research. It is unclear, however, how the current WSI registration methods would perform on a real-world data set. AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) challenge is held to verify the performance of the current WSI registration methods by using a new dataset that originates from routine diagnostics to assess…

    Submitted 24 August, 2022; originally announced August 2022.

  19. arXiv:2208.12041  [pdf]

    eess.IV cs.CV

    Multi-Modality Abdominal Multi-Organ Segmentation with Deep Supervised 3D Segmentation Model

    Authors: Satoshi Kondo, Satoshi Kasai

    Abstract: To promote the development of medical image segmentation technology, AMOS, a large-scale abdominal multi-organ dataset for versatile medical image segmentation, is provided and AMOS 2022 challenge is held by using the dataset. In this report, we present our solution for the AMOS 2022 challenge. We employ residual U-Net with deep supervision as our base model. The experimental results show that th…

    Submitted 23 August, 2022; originally announced August 2022.

  20. arXiv:2204.13260  [pdf]

    cs.ET cond-mat.mtrl-sci

    Pattern recognition with neuromorphic computing using magnetic-field induced dynamics of skyrmions

    Authors: Tomoyuki Yokouchi, Satoshi Sugimoto, Bivas Rana, Shinichiro Seki, Naoki Ogawa, Yuki Shiomi, Shinya Kasai, Yoshichika Otani

    Abstract: Nonlinear phenomena in physical systems can be used for brain-inspired computing with low energy consumption. Response from the dynamics of a topological spin structure called skyrmion is one of the candidates for such a neuromorphic computing. However, its ability has not been well explored experimentally. Here, we experimentally demonstrate neuromorphic computing using nonlinear response origina…

    Submitted 27 April, 2022; originally announced April 2022.

  21. arXiv:2202.11944  [pdf, ps, other]

    cs.CV

    Computer Aided Diagnosis and Out-of-Distribution Detection in Glaucoma Screening Using Color Fundus Photography

    Authors: Satoshi Kondo, Satoshi Kasai, Kosuke Hirasawa

    Abstract: Artificial Intelligence for RObust Glaucoma Screening (AIROGS) Challenge is held for developing solutions for glaucoma screening from color fundus photography that are robust to real-world scenarios. This report describes our method submitted to the AIROGS challenge. Our method employs convolutional neural networks to classify input images to "referable glaucoma" or "no referable glaucoma". In add…

    Submitted 24 February, 2022; originally announced February 2022.

  22. arXiv:2202.11804  [pdf, ps, other]

    eess.IV cs.CV

    Nuclei panoptic segmentation and composition regression with multi-task deep neural networks

    Authors: Satoshi Kondo, Satoshi Kasai

    Abstract: Nuclear segmentation, classification and quantification within Haematoxylin & Eosin stained histology images enables the extraction of interpretable cell-based features that can be used in downstream explainable models in computational pathology. The Colon Nuclei Identification and Counting (CoNIC) Challenge is held to help drive forward research and innovation for automatic nuclei recognition in…

    Submitted 23 February, 2022; originally announced February 2022.

  23. arXiv:2202.11287  [pdf, other]

    cs.CV cs.CR cs.LG

    LPF-Defense: 3D Adversarial Defense based on Frequency Analysis

    Authors: Hanieh Naderi, Kimia Noorbakhsh, Arian Etemadi, Shohreh Kasaei

    Abstract: Although 3D point cloud classification has recently been widely deployed in different application scenarios, it is still very vulnerable to adversarial attacks. This increases the importance of robust training of 3D models in the face of adversarial attacks. Based on our analysis on the performance of existing adversarial attacks, more adversarial perturbations are found in the mid and high-freque…

    Submitted 24 August, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

    Comments: 15 pages, 7 figures

  24. arXiv:2202.07537  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Information-Theoretic Analysis of Minimax Excess Risk

    Authors: Hassan Hafez-Kolahi, Behrad Moniri, Shohreh Kasaei

    Abstract: Two main concepts studied in machine learning theory are generalization gap (difference between train and test error) and excess risk (difference between test error and the minimum possible error). While information-theoretic tools have been used extensively to study the generalization gap of learning algorithms, the information-theoretic nature of excess risk has not yet been fully investigated.…

    Submitted 28 February, 2023; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: Published in the IEEE Transactions on Information Theory

  25. arXiv:2110.03745  [pdf, other]

    cs.CV

    Adversarial Attack by Limited Point Cloud Surface Modifications

    Authors: Atrin Arya, Hanieh Naderi, Shohreh Kasaei

    Abstract: Recent research has revealed that the security of deep neural networks that directly process 3D point clouds to classify objects can be threatened by adversarial samples. Although existing adversarial attack methods achieve high success rates, they do not restrict the point modifications enough to preserve the point cloud appearance. To overcome this shortcoming, two constraints are proposed. Thes…

    Submitted 7 October, 2021; originally announced October 2021.

  26. arXiv:2107.03463  [pdf, other]

    cs.CV cs.AI cs.LG

    CHASE: Robust Visual Tracking via Cell-Level Differentiable Neural Architecture Search

    Authors: Seyed Mojtaba Marvasti-Zadeh, Javad Khaghani, Li Cheng, Hossein Ghanei-Yakhdan, Shohreh Kasaei

    Abstract: A strong visual object tracker nowadays relies on its well-crafted modules, which typically consist of manually-designed network architectures to deliver high-quality tracking results. Not surprisingly, the manual design process becomes a particularly challenging barrier, as it demands sufficient prior experience, enormous effort, intuition, and perhaps some good luck. Meanwhile, neural architectu…

    Submitted 26 October, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: The first two authors contributed equally to this work. Accepted manuscript in BMVC 2021

  27. arXiv:2105.04180  [pdf, other]

    cs.LG cs.IT

    Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning

    Authors: Hassan Hafez-Kolahi, Behrad Moniri, Shohreh Kasaei, Mahdieh Soleymani Baghshah

    Abstract: In parametric Bayesian learning, a prior is assumed on the parameter $W$ which determines the distribution of samples. In this setting, Minimum Excess Risk (MER) is defined as the difference between the minimum expected loss achievable when learning from data and the minimum expected loss that could be achieved if $W$ was observed. In this paper, we build upon and extend the recent results of (Xu…

    Submitted 17 July, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: Accepted at ICML 2021

  28. arXiv:2103.07640  [pdf, other]

    cs.CV cs.LG

    Generating Unrestricted Adversarial Examples via Three Parameters

    Authors: Hanieh Naderi, Leili Goli, Shohreh Kasaei

    Abstract: Deep neural networks have been shown to be vulnerable to adversarial examples deliberately constructed to misclassify victim models. As most adversarial examples have restricted their perturbations to $L_{p}$-norm, existing defense methods have focused on these types of perturbations and less attention has been paid to unrestricted adversarial examples; which can create more realistic attacks, abl…

    Submitted 13 March, 2021; originally announced March 2021.

  29. arXiv:2010.04516  [pdf, other]

    cs.CV

    Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer

    Authors: Mahdi Ghorbani, Fahimeh Fooladgar, Shohreh Kasaei

    Abstract: Deep neural network architectures have attained remarkable improvements in scene understanding tasks. Utilizing an efficient model is one of the most important constraints for limited-resource devices. Recently, several compression methods have been proposed to diminish the heavy computational burden and memory consumption. Among them, the pruning and quantizing methods exhibit a critical drop in…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: 11 pages, 4 figures

  30. arXiv:2009.09235  [pdf, other]

    cs.CV cs.RO

    Open-Ended Fine-Grained 3D Object Categorization by Combining Shape and Texture Features in Multiple Colorspaces

    Authors: Nils Keunecke, S. Hamidreza Kasaei

    Abstract: As a consequence of an ever-increasing number of service robots, there is a growing demand for highly accurate real-time 3D object recognition. Considering the expansion of robot applications in more complex and dynamic environments, it is evident that it is not possible to pre-program all object categories and anticipate all exceptions in advance. Therefore, robots should have the functionality to…

    Submitted 28 May, 2021; v1 submitted 19 September, 2020; originally announced September 2020.

  31. arXiv:2008.13015  [pdf, other]

    cs.CV cs.AI

    Adaptive Exploitation of Pre-trained Deep Convolutional Neural Networks for Robust Visual Tracking

    Authors: Seyed Mojtaba Marvasti-Zadeh, Hossein Ghanei-Yakhdan, Shohreh Kasaei

    Abstract: Due to the automatic feature extraction procedure via multi-layer nonlinear transformations, the deep learning-based visual trackers have recently achieved great success in challenging scenarios for visual tracking purposes. Although many of those trackers utilize the feature maps from pre-trained convolutional neural networks (CNNs), the effects of selecting different models and exploiting variou…

    Submitted 22 December, 2020; v1 submitted 29 August, 2020; originally announced August 2020.

    Comments: Accepted Manuscript in Multimedia Tools and Applications (MTAP), Springer

  32. arXiv:2007.06866  [pdf, other]

    cs.CV

    Alleviating Over-segmentation Errors by Detecting Action Boundaries

    Authors: Yuchi Ishikawa, Seito Kasai, Yoshimitsu Aoki, Hirokatsu Kataoka

    Abstract: We propose an effective framework for the temporal action segmentation task, namely an Action Segment Refinement Framework (ASRF). Our model architecture consists of a long-term feature extractor and two branches: the Action Segmentation Branch (ASB) and the Boundary Regression Branch (BRB). The long-term feature extractor provides shared features for the two branches with a wide temporal receptiv…

    Submitted 14 July, 2020; originally announced July 2020.

    Comments: under review

  33. arXiv:2006.02597  [pdf, other]

    cs.CV eess.IV

    COMET: Context-Aware IoU-Guided Network for Small Object Tracking

    Authors: Seyed Mojtaba Marvasti-Zadeh, Javad Khaghani, Hossein Ghanei-Yakhdan, Shohreh Kasaei, Li Cheng

    Abstract: We consider the problem of tracking an unknown small target from aerial videos of medium to high altitudes. This is a challenging problem, which is even more pronounced in unavoidable scenarios of drastic camera motion and high density. To address this problem, we introduce a context-aware IoU-guided tracker (COMET) that exploits a multitask two-stream network and an offline reference proposal gen…

    Submitted 18 September, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

    Comments: Accepted manuscript in ACCV 2020

  34. arXiv:2005.09183  [pdf, other]

    cs.CV cs.CL cs.IR

    Retrieving and Highlighting Action with Spatiotemporal Reference

    Authors: Seito Kasai, Yuchi Ishikawa, Masaki Hayashi, Yoshimitsu Aoki, Kensho Hara, Hirokatsu Kataoka

    Abstract: In this paper, we present a framework that jointly retrieves and spatiotemporally highlights actions in videos by enhancing current deep cross-modal retrieval methods. Our work takes on the novel task of action highlighting, which visualizes where and when actions occur in an untrimmed video setting. Action highlighting is a fine-grained task, compared to conventional action recognition tasks whic…

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: Accepted to ICIP 2020

  35. arXiv:2004.02933  [pdf, other]

    cs.CV cs.LG eess.IV

    Efficient Scale Estimation Methods using Lightweight Deep Convolutional Neural Networks for Visual Tracking

    Authors: Seyed Mojtaba Marvasti-Zadeh, Hossein Ghanei-Yakhdan, Shohreh Kasaei

    Abstract: In recent years, visual tracking methods that are based on discriminative correlation filters (DCF) have been very promising. However, most of these methods suffer from a lack of robust scale estimation skills. Although a wide range of recent DCF-based methods exploit the features that are extracted from deep convolutional neural networks (CNNs) in their translation model, the scale of the visual…

    Submitted 11 December, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: Accepted Manuscript in Neural Computing and Applications (NCAA), Springer

  36. arXiv:2004.02932  [pdf, other]

    cs.CV cs.LG eess.IV

    Beyond Background-Aware Correlation Filters: Adaptive Context Modeling by Hand-Crafted and Deep RGB Features for Visual Tracking

    Authors: Seyed Mojtaba Marvasti-Zadeh, Hossein Ghanei-Yakhdan, Shohreh Kasaei

    Abstract: In recent years, the background-aware correlation filters have achieved a lot of research interest in the visual target tracking. However, these methods cannot suitably model the target appearance due to the exploitation of hand-crafted features. On the other hand, the recent deep learning-based visual tracking methods have provided a competitive performance along with extensive computations. In…

    Submitted 29 September, 2021; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: To be appeared in Multimedia Tools and Applications, Springer, 2021

  37. arXiv:2004.01382  [pdf, other]

    cs.CV cs.LG eess.IV

    Effective Fusion of Deep Multitasking Representations for Robust Visual Tracking

    Authors: Seyed Mojtaba Marvasti-Zadeh, Hossein Ghanei-Yakhdan, Shohreh Kasaei, Kamal Nasrollahi, Thomas B. Moeslund

    Abstract: Visual object tracking remains an active research field in computer vision due to persisting challenges with various problem-specific factors in real-world scenes. Many existing tracking methods based on discriminative correlation filters (DCFs) employ feature extraction networks (FENs) to model the target appearance during the learning process. However, using deep feature maps extracted from FENs…

    Submitted 20 September, 2021; v1 submitted 3 April, 2020; originally announced April 2020.

    Comments: To be appeared in The Visual Computer (International Journal of Computer Graphics), Springer, 2021

  38. arXiv:2003.08151  [pdf, other]

    cs.RO cs.CV

    The State of Lifelong Learning in Service Robots: Current Bottlenecks in Object Perception and Manipulation

    Authors: S. Hamidreza Kasaei, Jorik Melsen, Floris van Beers, Christiaan Steenkist, Klemen Voncina

    Abstract: Service robots are appearing more and more in our daily life. The development of service robots combines multiple fields of research, from object perception to object manipulation. The state-of-the-art continues to improve to make a proper coupling between object perception and manipulation. This coupling is necessary for service robots not only to perform various tasks in a reasonable amount of t…

    Submitted 6 May, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

  39. arXiv:2002.03892  [pdf, other]

    cs.RO

    Learning to Grasp 3D Objects using Deep Residual U-Nets

    Authors: Yikun Li, Lambert Schomaker, S. Hamidreza Kasaei

    Abstract: Grasp synthesis is one of the challenging tasks for any robot object manipulation task. In this paper, we present a new deep learning-based grasp synthesis approach for 3D objects. In particular, we propose an end-to-end 3D Convolutional Neural Network to predict the objects' graspable areas. We named our approach Res-U-Net since the architecture of the network is designed based on U-Net structure…

    Submitted 12 September, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

  40. arXiv:2002.03779  [pdf, other]

    cs.RO cs.CV

    Investigating the Importance of Shape Features, Color Constancy, Color Spaces and Similarity Measures in Open-Ended 3D Object Recognition

    Authors: S. Hamidreza Kasaei, Maryam Ghorbani, Jits Schilperoort, Wessel van der Rest

    Abstract: Despite the recent success of state-of-the-art 3D object recognition approaches, service robots frequently fail to recognize many objects in real human-centric environments. For these robots, object recognition is a challenging task due to the high demand for accurate and real-time response under changing and unpredictable environmental conditions. Most of the recent approaches use either th…

    Submitted 26 September, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

  41. Lightweight Residual Densely Connected Convolutional Neural Network

    Authors: Fahimeh Fooladgar, Shohreh Kasaei

    Abstract: Extremely efficient convolutional neural network architectures are one of the most important requirements for limited-resource devices (such as embedded and mobile devices). The computing power and memory size are two important constraints of these devices. Recently, some architectures have been proposed to overcome these limitations by considering specific hardware-software equipment. In this pap…

    Submitted 8 June, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

  42. arXiv:1912.12082  [pdf, other]

    cs.CV

    Pointwise Attention-Based Atrous Convolutional Neural Networks

    Authors: Mobina Mahdavi, Fahimeh Fooladgar, Shohreh Kasaei

    Abstract: With the rapid progress of deep convolutional neural networks, in almost all robotic applications, the availability of 3D point clouds improves the accuracy of 3D semantic segmentation methods. Rendering of these irregular, unstructured, and unordered 3D points to 2D images from multiple viewpoints imposes some issues such as loss of information due to 3D to 2D projection, discretizing artifacts,…

    Submitted 27 December, 2019; originally announced December 2019.

    Comments: 7 pages, 6 figures. Author one and author two contributed equally

  43. arXiv:1912.11691  [pdf, other]

    cs.CV cs.MM

    Multi-Modal Attention-based Fusion Model for Semantic Segmentation of RGB-Depth Images

    Authors: Fahimeh Fooladgar, Shohreh Kasaei

    Abstract: 3D scene understanding is widely considered a crucial requirement in computer vision and robotics applications. One of the high-level tasks in 3D scene understanding is semantic segmentation of RGB-Depth images. With the availability of RGB-D cameras, it is desired to improve the accuracy of the scene understanding process by exploiting the depth features along with the appearance features.…

    Submitted 25 December, 2019; originally announced December 2019.

  44. arXiv:1912.09539  [pdf, other]

    cs.RO cs.AI cs.CV

    Interactive Open-Ended Learning for 3D Object Recognition

    Authors: S. Hamidreza Kasaei

    Abstract: The thesis contributes in several important ways to the research area of 3D object category learning and recognition. To cope with the mentioned limitations, we look at human cognition, in particular at the fact that human beings learn to recognize object categories ceaselessly over time. This ability to refine knowledge from the set of accumulated experiences facilitates the adaptation to new env…

    Submitted 19 December, 2019; originally announced December 2019.

    Comments: PhD thesis

  45. arXiv:1912.00535  [pdf, other]

    cs.CV cs.LG eess.IV

    Deep Learning for Visual Tracking: A Comprehensive Survey

    Authors: Seyed Mojtaba Marvasti-Zadeh, Li Cheng, Hossein Ghanei-Yakhdan, Shohreh Kasaei

    Abstract: Visual target tracking is one of the most sought-after yet challenging research topics in computer vision. Given the ill-posed nature of the problem and its popularity in a broad range of real-world scenarios, a number of large-scale benchmark datasets have been established, on which considerable methods have been developed and demonstrated with significant progress in recent years -- predominantl…

    Submitted 26 January, 2021; v1 submitted 1 December, 2019; originally announced December 2019.

    Comments: Accepted Manuscript in IEEE Transactions on Intelligent Transportation Systems

  46. arXiv:1909.09706  [pdf, other]

    cs.LG cs.AI cs.IT stat.ML

    Do Compressed Representations Generalize Better?

    Authors: Hassan Hafez-Kolahi, Shohreh Kasaei, Mahdiyeh Soleymani-Baghshah

    Abstract: One of the most studied problems in machine learning is finding reasonable constraints that guarantee the generalization of a learning algorithm. These constraints are usually expressed as some simplicity assumptions on the target. For instance, in the Vapnik-Chervonenkis (VC) theory the space of possible hypotheses is considered to have a limited VC dimension. In this paper, the constraint on the…

    Submitted 2 January, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

  47. arXiv:1907.12924  [pdf, other]

    cs.CV cs.RO

    Look Further to Recognize Better: Learning Shared Topics and Category-Specific Dictionaries for Open-Ended 3D Object Recognition

    Authors: S. Hamidreza Kasaei

    Abstract: Service robots are expected to operate effectively in human-centric environments for long periods of time. In such realistic scenarios, fine-grained object categorization is as important as basic-level object categorization. We tackle this problem by proposing an open-ended object recognition approach which concurrently learns both the object categories and the local features for encoding objects.…

    Submitted 26 July, 2019; originally announced July 2019.

    Comments: arXiv admin note: text overlap with arXiv:1902.03057

  48. arXiv:1907.10932  [pdf, other]

    cs.RO

    Object Perception and Grasping in Open-Ended Domains

    Authors: S. Hamidreza Kasaei

    Abstract: Nowadays, service robots are leaving structured and completely known environments and entering human-centric settings. For these robots, object perception and grasping are two challenging tasks due to the high demand for accurate and real-time responses. Although many problems have already been understood and solved successfully, many challenges still remain. Open-ended learning is one of these…

    Submitted 25 July, 2019; originally announced July 2019.

  49. arXiv:1904.03743  [pdf, other]

    cs.LG cs.IT stat.ML

    Information Bottleneck and its Applications in Deep Learning

    Authors: Hassan Hafez-Kolahi, Shohreh Kasaei

    Abstract: Information Theory (IT) has been used in Machine Learning (ML) since the early days of the field. In the last decade, advances in Deep Neural Networks (DNNs) have led to surprising improvements in many applications of ML. The result has been a paradigm shift in the community toward revisiting previous ideas and applications in this new framework. Ideas from IT are no exception. One of the ideas which…

    Submitted 7 April, 2019; originally announced April 2019.

  50. arXiv:1904.02530  [pdf, other]

    cs.RO

    Interactive Open-Ended Object, Affordance and Grasp Learning for Robotic Manipulation

    Authors: S. Hamidreza Kasaei, Nima Shafii, Luis Seabra Lopes, Ana Maria Tome

    Abstract: Service robots are expected to work autonomously and efficiently in human-centric environments. For this type of robot, object perception and manipulation are challenging tasks due to the need for accurate and real-time response. This paper presents an interactive open-ended learning approach to recognize multiple objects and their grasp affordances concurrently. This is an important contribution in…

    Submitted 4 April, 2019; originally announced April 2019.