Showing 1–32 of 32 results for author: Codella, N

  1. arXiv:2403.12852  [pdf, other]

    eess.IV cs.CV

    Generative Enhancement for 3D Medical Images

    Authors: Lingting Zhu, Noel Codella, Dongdong Chen, Zhenchao Jin, Lu Yuan, Lequan Yu

    Abstract: The limited availability of 3D medical image datasets, due to privacy concerns and high collection or annotation costs, poses significant challenges in the field of medical imaging. While a promising alternative is the use of synthesized medical data, there are few solutions for realistic 3D medical image synthesis due to difficulties in backbone design and fewer 3D training samples compared to 2D…

    Submitted 24 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 20 pages, 8 figures

  2. arXiv:2401.10815  [pdf, other]

    cs.CV

    RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision

    Authors: Fernando Pérez-García, Harshita Sharma, Sam Bond-Taylor, Kenza Bouzid, Valentina Salvatelli, Maximilian Ilse, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Matthew P. Lungren, Maria Wetscherek, Noel Codella, Stephanie L. Hyland, Javier Alvarez-Valle, Ozan Oktay

    Abstract: Language-supervised pre-training has proven to be a valuable method for extracting semantically meaningful features from images, serving as a foundational element in multimodal systems within the computer vision and medical imaging domains. However, resulting features are limited by the information contained within the text. This is particularly problematic in medical imaging, where radiologists'…

    Submitted 19 January, 2024; originally announced January 2024.

  3. arXiv:2311.15562  [pdf, other]

    cs.CV

    Fully Authentic Visual Question Answering Dataset from Online Communities

    Authors: Chongyan Chen, Mengchen Liu, Noel Codella, Yunsheng Li, Lu Yuan, Danna Gurari

    Abstract: Visual Question Answering (VQA) entails answering questions about images. We introduce the first VQA dataset in which all contents originate from an authentic use case. Sourced from online question answering community forums, we call it VQAonline. We characterize this dataset and how it relates to eight mainstream VQA datasets. Observing that answers in our dataset tend to be much longer (i.e., a…

    Submitted 18 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  4. arXiv:2311.13752  [pdf, other]

    cs.CV cs.AI

    3D-MIR: A Benchmark and Empirical Study on 3D Medical Image Retrieval in Radiology

    Authors: Asma Ben Abacha, Alberto Santamaria-Pang, Ho Hin Lee, Jameson Merkow, Qin Cai, Surya Teja Devarakonda, Abdullah Islam, Julia Gong, Matthew P. Lungren, Thomas Lin, Noel C Codella, Ivan Tarapov

    Abstract: The increasing use of medical imaging in healthcare settings presents a significant challenge due to the increasing workload for radiologists, yet it also offers opportunity for enhancing healthcare outcomes if effectively leveraged. 3D image retrieval holds potential to reduce radiologist workloads by enabling clinicians to efficiently search through diagnostically similar or otherwise relevant c…

    Submitted 22 November, 2023; originally announced November 2023.

  5. arXiv:2311.13668  [pdf, other]

    cs.CL cs.AI cs.CV

    MAIRA-1: A specialised large multimodal model for radiology report generation

    Authors: Stephanie L. Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Mercy Ranjit, Anton Schwaighofer, Fernando Pérez-García, Valentina Salvatelli, Shaury Srivastav, Anja Thieme, Noel Codella, Matthew P. Lungren, Maria Teodora Wetscherek, Ozan Oktay, Javier Alvarez-Valle

    Abstract: We present a radiology-specific multimodal model for the task of generating radiological reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) can be equipped with multimodal capabilities through alignment with pre-trained vision encoders. On natural images, this has been shown to allow multimodal models to gain image understanding and description capabilities…

    Submitted 26 April, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: 18 pages, 9 tables, 5 figures. v2 adds test IDs and image encoder citation. v3 fixes error in NPV/specificity

  6. arXiv:2310.14670  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond

    Authors: Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

    Abstract: Vision-language (VL) understanding tasks evaluate models' comprehension of complex visual scenes through multiple-choice questions. However, we have identified two dataset biases that models can exploit as shortcuts to resolve various VL tasks correctly without proper understanding. The first type of dataset bias is \emph{Unbalanced Matching} bias, where the correct answer overlaps the question an…

    Submitted 31 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

    Journal ref: EMNLP 2023

  7. arXiv:2307.00862  [pdf, other]

    cs.CV cs.CL

    UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding

    Authors: Rui Sun, Zhecan Wang, Haoxuan You, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

    Abstract: Vision-language tasks, such as VQA, SNLI-VE, and VCR are challenging because they require the model's reasoning ability to understand the semantics of the visual world and natural language. Supervised methods working for vision-language tasks have been well-studied. However, solving these tasks in a zero-shot setting is less explored. Since Contrastive Language-Image Pre-training (CLIP) has shown…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: 14 pages, 4 figures, ACL 2023 Findings

  8. arXiv:2305.12311  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG eess.AS

    i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

    Authors: Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

    Abstract: The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence, however the current Vision-Language-Speech landscape is dominated by encoder-only models which lack generative abilities. We propose closing this gap with i-Code V2, the first model capable of generating natural language from any combination of Vision, Language, and Speech data. i-Code V2 is a…

    Submitted 20 May, 2023; originally announced May 2023.

  9. arXiv:2303.17228  [pdf, other]

    cs.CV

    Streaming Video Model

    Authors: Yucheng Zhao, Chong Luo, Chuanxin Tang, Dongdong Chen, Noel Codella, Zheng-Jun Zha

    Abstract: Video understanding tasks have traditionally been modeled by two separate architectures, specially tailored for two distinct tasks. Sequence-based video tasks, such as action recognition, use a video backbone to directly extract spatiotemporal features, while frame-based video tasks, such as multiple object tracking (MOT), rely on single fixed-image backbone to extract spatial features. In contras…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR'23

  10. arXiv:2207.12661  [pdf, other]

    cs.CV cs.CL

    Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training

    Authors: Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan

    Abstract: Large-scale multi-modal contrastive pre-training has demonstrated great utility to learn transferable features for a range of downstream tasks by mapping multiple modalities into a shared embedding space. Typically, this has employed separate encoders for each modality. However, recent work suggests that transformers can support learning across multiple modalities and allow knowledge sharing. Insp…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV 2022, 22 pages, 4 figures

  11. arXiv:2205.01818  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV eess.AS

    i-Code: An Integrative and Composable Multimodal Learning Framework

    Authors: Ziyi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

    Abstract: Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview. Most current pretraining methods, however, are limited to one or two modalities. We present i-Code, a self-supervised pretraining framework where users may flexibly combine the modalities of vision, speech, and language into unified and general-purpose vector representations. I…

    Submitted 5 May, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

  12. arXiv:2204.10496  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks

    Authors: Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Xiyang Dai, Bin Xiao, Jianwei Yang, Haoxuan You, Kai-Wei Chang, Shih-fu Chang, Lu Yuan

    Abstract: Cross-modal encoders for vision-language (VL) tasks are often pretrained with carefully curated vision-language datasets. While these datasets reach an order of 10 million samples, the labor cost is prohibitive to scale further. Conversely, unimodal encoders are pretrained with simpler annotations that are less cost-prohibitive, achieving scales of hundreds of millions to billions. As a result, un…

    Submitted 28 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2201.05729

  13. arXiv:2204.03645  [pdf, other]

    cs.CV

    DaViT: Dual Attention Vision Transformers

    Authors: Mingyu Ding, Bin Xiao, Noel Codella, Ping Luo, Jingdong Wang, Lu Yuan

    Abstract: In this work, we introduce Dual Attention Vision Transformers (DaViT), a simple yet effective vision transformer architecture that is able to capture global context while maintaining computational efficiency. We propose approaching the problem from an orthogonal angle: exploiting self-attention mechanisms with both "spatial tokens" and "channel tokens". With spatial tokens, the spatial dimension d…

    Submitted 7 April, 2022; originally announced April 2022.

  14. arXiv:2201.05729  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks

    Authors: Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Jianwei Yang, Xiyang Dai, Bin Xiao, Haoxuan You, Shih-Fu Chang, Lu Yuan

    Abstract: Contrastive language-image pretraining (CLIP) links vision and language modalities into a unified embedding space, yielding the tremendous potential for vision-language (VL) tasks. While early concurrent works have begun to study this potential on a subset of tasks, important questions remain: 1) What is the benefit of CLIP on unstudied VL tasks? 2) Does CLIP provide benefit in low-shot or domain-…

    Submitted 28 December, 2022; v1 submitted 14 January, 2022; originally announced January 2022.

    Comments: This paper is greatly modified and updated to be re-submitted to another conference. The new paper is under the name "Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks", https://doi.org/10.48550/arXiv.2204.10496

  15. arXiv:2112.09106  [pdf, other]

    cs.CV cs.AI cs.LG

    RegionCLIP: Region-based Language-Image Pretraining

    Authors: Yiwu Zhong, Jianwei Yang, Pengchuan Zhang, Chunyuan Li, Noel Codella, Liunian Harold Li, Luowei Zhou, Xiyang Dai, Lu Yuan, Yin Li, Jianfeng Gao

    Abstract: Contrastive language-image pretraining (CLIP) using image-text pairs has achieved impressive results on image classification in both zero-shot and transfer learning settings. However, we show that directly applying such models to recognize image regions for object detection leads to poor performance due to a domain shift: CLIP was trained to match an image as a whole to a text description, without…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: Technical report

  16. arXiv:2111.11432  [pdf, other]

    cs.CV cs.AI cs.LG

    Florence: A New Foundation Model for Computer Vision

    Authors: Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang

    Abstract: Automated visual understanding of our diverse and open world demands computer vision models to generalize well with minimal customization for specific tasks, similar to human vision. Computer vision foundation models, which are trained on diverse, large-scale dataset and can be adapted to a wide range of downstream tasks, are critical for this mission to solve real-world computer vision applicatio…

    Submitted 22 November, 2021; originally announced November 2021.

  17. arXiv:2103.15808  [pdf, other]

    cs.CV

    CvT: Introducing Convolutions to Vision Transformers

    Authors: Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang

    Abstract: We present in this paper a new architecture, named Convolutional vision Transformer (CvT), that improves Vision Transformer (ViT) in performance and efficiency by introducing convolutions into ViT to yield the best of both designs. This is accomplished through two primary modifications: a hierarchy of Transformers containing a new convolutional token embedding, and a convolutional Transformer bloc…

    Submitted 29 March, 2021; originally announced March 2021.

  18. arXiv:2008.07360  [pdf]

    eess.IV cs.CV cs.CY physics.med-ph

    A Patient-Centric Dataset of Images and Metadata for Identifying Melanomas Using Clinical Context

    Authors: Veronica Rotemberg, Nicholas Kurtansky, Brigid Betz-Stablein, Liam Caffery, Emmanouil Chousakos, Noel Codella, Marc Combalia, Stephen Dusza, Pascale Guitera, David Gutman, Allan Halpern, Harald Kittler, Kivanc Kose, Steve Langer, Konstantinos Lioprys, Josep Malvehy, Shenara Musthaq, Jabpani Nanda, Ofer Reiter, George Shih, Alexander Stratigos, Philipp Tschandl, Jochen Weber, H. Peter Soyer

    Abstract: Prior skin image datasets have not addressed patient-level information obtained from multiple skin lesions from the same patient. Though artificial intelligence classification algorithms have achieved expert-level performance in controlled studies examining single images, in practice dermatologists base their judgment holistically from multiple lesions on the same patient. The 2020 SIIM-ISIC Melan…

    Submitted 7 August, 2020; originally announced August 2020.

    Comments: Figures: 3, Tables: 2, Pages: 12

  19. arXiv:1912.07200  [pdf, other]

    cs.CV cs.LG

    A Broader Study of Cross-Domain Few-Shot Learning

    Authors: Yunhui Guo, Noel C. Codella, Leonid Karlinsky, James V. Codella, John R. Smith, Kate Saenko, Tajana Rosing, Rogerio Feris

    Abstract: Recent progress on few-shot learning largely relies on annotated data for meta-learning: base classes sampled from the same domain as the novel classes. However, in many applications, collecting data for meta-learning is infeasible or impossible. This leads to the cross-domain few-shot learning problem, where there is a large shift between base and novel class domains. While investigations of the…

    Submitted 17 July, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

    Comments: ECCV 2020. Website: https://www.learning-with-limited-labels.com/

  20. arXiv:1910.13268  [pdf, other]

    cs.CV cs.CY stat.ML

    Estimating Skin Tone and Effects on Classification Performance in Dermatology Datasets

    Authors: Newton M. Kinyanjui, Timothy Odonga, Celia Cintas, Noel C. F. Codella, Rameswar Panda, Prasanna Sattigeri, Kush R. Varshney

    Abstract: Recent advances in computer vision and deep learning have led to breakthroughs in the development of automated skin image analysis. In particular, skin cancer classification models have achieved performance higher than trained expert dermatologists. However, no attempt has been made to evaluate the consistency in performance of machine learning models across populations with varying skin tones. In…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019 Workshop on Fair ML for Health

  21. arXiv:1908.07630  [pdf, other]

    cs.LG cs.AI cs.CV

    P2L: Predicting Transfer Learning for Images and Semantic Relations

    Authors: Bishwaranjan Bhattacharjee, John R. Kender, Matthew Hill, Parijat Dube, Siyu Huo, Michael R. Glass, Brian Belgodere, Sharath Pankanti, Noel Codella, Patrick Watson

    Abstract: Transfer learning enhances learning across tasks, by leveraging previously learned representations -- if they are properly chosen. We describe an efficient method to accurately estimate the appropriateness of a previously trained model for use in a new learning task. We use this measure, which we call "Predict To Learn" ("P2L"), in the two very different domains of images and semantic relations, w…

    Submitted 15 October, 2020; v1 submitted 20 August, 2019; originally announced August 2019.

    Comments: 10 pages, 8 figures, 4 tables

  22. arXiv:1908.02288  [pdf, other]

    eess.IV cs.CV

    BCN20000: Dermoscopic Lesions in the Wild

    Authors: Marc Combalia, Noel C. F. Codella, Veronica Rotemberg, Brian Helba, Veronica Vilaplana, Ofer Reiter, Cristina Carrera, Alicia Barreiro, Allan C. Halpern, Susana Puig, Josep Malvehy

    Abstract: This article summarizes the BCN20000 dataset, composed of 19424 dermoscopic images of skin lesions captured from 2010 to 2016 in the facilities of the Hospital Clínic in Barcelona. With this dataset, we aim to study the problem of unconstrained classification of dermoscopic images of skin cancer, including lesions found in hard-to-diagnose locations (nails and mucosa), large lesions which do not f…

    Submitted 30 August, 2019; v1 submitted 6 August, 2019; originally announced August 2019.

    Comments: Abstract for BCN20000

  23. arXiv:1906.02299  [pdf, other]

    cs.LG cs.AI stat.ML

    Teaching AI to Explain its Decisions Using Embeddings and Multi-Task Learning

    Authors: Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilović

    Abstract: Using machine learning in high-stakes applications often requires predictions to be accompanied by explanations comprehensible to the domain user, who has ultimate responsibility for decisions and outcomes. Recently, a new framework for providing explanations, called TED, has been proposed to provide meaningful explanations for predictions. This framework augments training data to include explanat…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: presented at 2019 ICML Workshop on Human in the Loop Learning (HILL 2019), Long Beach, USA. arXiv admin note: substantial text overlap with arXiv:1805.11648

  24. arXiv:1902.03368  [pdf, other]

    cs.CV

    Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)

    Authors: Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael Marchetti, Harald Kittler, Allan Halpern

    Abstract: This work summarizes the results of the largest skin image analysis challenge in the world, hosted by the International Skin Imaging Collaboration (ISIC), a global partnership that has organized the world's largest public repository of dermoscopic images of skin. The challenge was hosted in 2018 at the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference in Granada, Spain…

    Submitted 29 March, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

    Comments: https://challenge2018.isic-archive.com/

  25. arXiv:1811.04896  [pdf, other]

    cs.AI

    TED: Teaching AI to Explain its Decisions

    Authors: Michael Hind, Dennis Wei, Murray Campbell, Noel C. F. Codella, Amit Dhurandhar, Aleksandra Mojsilović, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Artificial intelligence systems are being increasingly deployed due to their potential to increase the efficiency, scale, consistency, fairness, and accuracy of decisions. However, as many of these systems are opaque in their operation, there is a growing demand for such systems to provide explanations for their decisions. Conventional approaches to this problem attempt to expose or discover the i…

    Submitted 15 June, 2019; v1 submitted 12 November, 2018; originally announced November 2018.

    Comments: This article leverages some content from arXiv:1805.11648; presented at ACM/AAAI AIES'19

  26. arXiv:1805.12234  [pdf, other]

    cs.CV

    Collaborative Human-AI (CHAI): Evidence-Based Interpretable Melanoma Classification in Dermoscopic Images

    Authors: Noel C. F. Codella, Chung-Ching Lin, Allan Halpern, Michael Hind, Rogerio Feris, John R. Smith

    Abstract: Automated dermoscopic image analysis has witnessed rapid growth in diagnostic performance. Yet adoption faces resistance, in part, because no evidence is provided to support decisions. In this work, an approach for evidence-based classification is presented. A feature embedding is learned with CNNs, triplet-loss, and global average pooling, and used to classify via kNN search. Evidence is provided…

    Submitted 1 August, 2018; v1 submitted 30 May, 2018; originally announced May 2018.

    Comments: Presented at MICCAI 2018, Workshop on Interpretability of Machine Intelligence in Medical Image Computing (IMIMIC): https://imimic.bitbucket.io

  27. arXiv:1805.11648  [pdf, other]

    cs.AI

    Teaching Meaningful Explanations

    Authors: Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilovic

    Abstract: The adoption of machine learning in high-stakes applications such as healthcare and law has lagged in part because predictions are not accompanied by explanations comprehensible to the domain user, who often holds the ultimate responsibility for decisions and outcomes. In this paper, we propose an approach to generate such explanations in which training data is augmented to include, in addition to…

    Submitted 10 September, 2018; v1 submitted 29 May, 2018; originally announced May 2018.

    Comments: 9 pages

  28. arXiv:1804.05944  [pdf, other]

    cs.CV

    Segmentation of both Diseased and Healthy Skin from Clinical Photographs in a Primary Care Setting

    Authors: Noel C. F. Codella, Daren Anderson, Tyler Philips, Anthony Porto, Kevin Massey, Jane Snowdon, Rogerio Feris, John Smith

    Abstract: This work presents the first segmentation study of both diseased and healthy skin in standard camera photographs from a clinical environment. Challenges arise from varied lighting conditions, skin types, backgrounds, and pathological states. For study, 400 clinical photographs (with skin segmentation masks) representing various pathological states of skin are retrospectively collected from a prima…

    Submitted 17 April, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

    Comments: Accepted to IEEE EMBC 2018

  29. arXiv:1710.05006  [pdf, other]

    cs.CV

    Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC)

    Authors: Noel C. F. Codella, David Gutman, M. Emre Celebi, Brian Helba, Michael A. Marchetti, Stephen W. Dusza, Aadi Kalloo, Konstantinos Liopyris, Nabin Mishra, Harald Kittler, Allan Halpern

    Abstract: This article describes the design, implementation, and results of the latest installment of the dermoscopic image analysis benchmark challenge. The goal is to support research and development of algorithms for automated diagnosis of melanoma, the most lethal skin cancer. The challenge was divided into 3 tasks: lesion segmentation, feature detection, and disease classification. Participation involv…

    Submitted 8 January, 2018; v1 submitted 13 October, 2017; originally announced October 2017.

  30. arXiv:1610.04662  [pdf]

    cs.CV

    Deep Learning Ensembles for Melanoma Recognition in Dermoscopy Images

    Authors: Noel Codella, Quoc-Bao Nguyen, Sharath Pankanti, David Gutman, Brian Helba, Allan Halpern, John R. Smith

    Abstract: Melanoma is the deadliest form of skin cancer. While curable with early detection, only highly trained specialists are capable of accurately recognizing the disease. As expertise is in limited supply, automated systems capable of identifying disease could save lives, reduce unnecessary biopsies, and reduce costs. Toward this goal, we propose a system that combines recent developments in deep learn…

    Submitted 17 October, 2016; v1 submitted 14 October, 2016; originally announced October 2016.

    Comments: URL for the IBM Journal of Research and Development: http://www.research.ibm.com/journal/

    Journal ref: IBM Journal of Research and Development, vol. 61, no. 4/5, 2017

  31. arXiv:1605.01397  [pdf, other]

    cs.CV

    Skin Lesion Analysis toward Melanoma Detection: A Challenge at the International Symposium on Biomedical Imaging (ISBI) 2016, hosted by the International Skin Imaging Collaboration (ISIC)

    Authors: David Gutman, Noel C. F. Codella, Emre Celebi, Brian Helba, Michael Marchetti, Nabin Mishra, Allan Halpern

    Abstract: In this article, we describe the design and implementation of a publicly accessible dermatology image analysis benchmark challenge. The goal of the challenge is to support research and development of algorithms for automated diagnosis of melanoma, a lethal form of skin cancer, from dermoscopic images. The challenge was divided into sub-challenges for each task involved in image analysis, includi…

    Submitted 4 May, 2016; originally announced May 2016.

  32. arXiv:1511.06448  [pdf, other]

    cs.LG cs.CV

    Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks

    Authors: Pouya Bashivan, Irina Rish, Mohammed Yeasin, Noel Codella

    Abstract: One of the challenges in modeling cognitive events from electroencephalogram (EEG) data is finding representations that are invariant to inter- and intra-subject differences, as well as to inherent noise associated with such data. Herein, we propose a novel approach for learning such representations from multi-channel EEG time-series, and demonstrate its advantages in the context of mental load cl…

    Submitted 29 February, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

    Comments: To be published as a conference paper at ICLR 2016