Showing 1–32 of 32 results for author: Chowdhury, S A

  1. arXiv:2406.16099  [pdf, other]

    cs.SD eess.AS

    Speech Representation Analysis based on Inter- and Intra-Model Similarities

    Authors: Yassine El Kheir, Ahmed Ali, Shammur Absar Chowdhury

    Abstract: Self-supervised models have revolutionized speech processing, achieving new levels of performance in a wide variety of tasks with limited resources. However, the inner workings of these models are still opaque. In this paper, we aim to analyze the encoded contextual representation of these foundation models based on their inter- and intra-model similarity, independent of any external annotation an…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 5 pages, Accepted to appear in ICASSP XAI-SA Workshop

  2. arXiv:2406.13431  [pdf, other]

    cs.CL cs.SD eess.AS

    Children's Speech Recognition through Discrete Token Enhancement

    Authors: Vrunda N. Sukhadia, Shammur Absar Chowdhury

    Abstract: Children's speech recognition is considered a low-resource task mainly due to the lack of publicly available data. There are several reasons for such data scarcity, including expensive data collection and annotation processes, and data privacy, among others. Transforming speech signals into discrete tokens that do not carry sensitive information but capture both linguistic and acoustic information…

    Submitted 24 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  3. arXiv:2406.04511  [pdf, other]

    cs.CV

    Classification of Non-native Handwritten Characters Using Convolutional Neural Network

    Authors: F. A. Mamun, S. A. H. Chowdhury, J. E. Giti, H. Sarker

    Abstract: The use of convolutional neural networks (CNNs) has accelerated the progress of handwritten character classification/recognition. Handwritten character recognition (HCR) has found applications in various domains, such as traffic signal detection, language translation, and document information extraction. However, the widespread use of existing HCR technology is yet to be seen as it does not provid…

    Submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2311.10727  [pdf]

    cs.CY

    Unveiling the Potential of Big Data Analytics for Transforming Higher Education in Bangladesh; Needs, Prospects, and Challenges

    Authors: Sabbir Ahmed Chowdhury, Md Aminul Islam, Mostafa Azad Kamal

    Abstract: Big Data Analytics has gained tremendous momentum in many sectors worldwide. Big Data has substantial influence in the field of Learning Analytics that may allow academic institutions to better understand the learners' needs and proactively address them. Hence, it is essential to understand Big Data and its application. With the capability of Big Data to find a broad understanding of the scientific…

    Submitted 24 November, 2023; v1 submitted 10 October, 2023; originally announced November 2023.

  5. arXiv:2311.03196  [pdf, other]

    cs.CL cs.AI

    Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition

    Authors: Rabindra Nath Nandi, Mehadi Hasan Menon, Tareq Al Muntasir, Sagor Sarker, Quazi Sarwar Muhtaseem, Md. Tariqul Islam, Shammur Absar Chowdhury, Firoj Alam

    Abstract: One of the major challenges for developing automatic speech recognition (ASR) for low-resource languages is the limited access to labeled data with domain-specific variations. In this study, we propose a pseudo-labeling approach to develop a large-scale domain-agnostic ASR dataset. With the proposed methodology, we developed a 20k+ hours labeled Bangla speech dataset covering diverse topics, speak…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted at BLP-2023 (at EMNLP 2023), ASR, low-resource, out-of-distribution, domain-agnostic

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  6. arXiv:2310.13974  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Pronunciation Assessment -- A Review

    Authors: Yassine El Kheir, Ahmed Ali, Shammur Absar Chowdhury

    Abstract: Pronunciation assessment and its application in computer-aided pronunciation training (CAPT) have seen impressive progress in recent years. With the rapid growth in language processing and deep learning over the past few years, there is a need for an updated review. In this paper, we review methods employed in pronunciation assessment for both phonemic and prosodic. We categorize the main challeng…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: 9 pages, accepted to EMNLP Findings

  7. arXiv:2309.07739  [pdf, other]

    cs.CL cs.SD eess.AS

    The complementary roles of non-verbal cues for Robust Pronunciation Assessment

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: Research on pronunciation assessment systems focuses on utilizing phonetic and phonological aspects of non-native (L2) speech, often neglecting the rich layer of information hidden within the non-verbal cues. In this study, we proposed a novel pronunciation assessment framework, IntraVerbalPA. The framework innovatively incorporates both fine-grained frame- and abstract utterance-level non-verba…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 5 pages, submitted to ICASSP 2024

  8. arXiv:2309.07719  [pdf, other]

    cs.CL cs.SD eess.AS

    L1-aware Multilingual Mispronunciation Detection Framework

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: The phonological discrepancies between a speaker's native (L1) and the non-native language (L2) serve as a major factor for mispronunciation. This paper introduces a novel multilingual MDD architecture, L1-MultiMDD, enriched with L1-aware speech representation. An end-to-end speech encoder is trained on the input signal and its corresponding reference phoneme sequence. First, an attention mechani…

    Submitted 21 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: 5 pages, submitted to ICASSP 2024

  9. arXiv:2308.15402  [pdf]

    cs.HC

    Bornil: An open-source sign language data crowdsourcing platform for AI enabled dialect-agnostic communication

    Authors: Shahriar Elahi Dhruvo, Mohammad Akhlaqur Rahman, Manash Kumar Mandal, Md. Istiak Hossain Shihab, A. A. Noman Ansary, Kaneez Fatema Shithi, Sanjida Khanom, Rabeya Akter, Safaeid Hossain Arib, M. N. Ansary, Sazia Mehnaz, Rezwana Sultana, Sejuti Rahman, Sayma Sultana Chowdhury, Sabbir Ahmed Chowdhury, Farig Sadeque, Asif Sushmit

    Abstract: The absence of annotated sign language datasets has hindered the development of sign language recognition and translation technologies. In this paper, we introduce Bornil; a crowdsource-friendly, multilingual sign language data collection, annotation, and validation platform. Bornil allows users to record sign language gestures and lets annotators perform sentence and gloss-level annotation. It al…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: 6 pages, 7 figures

  10. arXiv:2308.04945  [pdf, other]

    cs.CL cs.AI

    LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking

    Authors: Fahim Dalvi, Maram Hasanain, Sabri Boughorbel, Basel Mousi, Samir Abdaljalil, Nizi Nazar, Ahmed Abdelali, Shammur Absar Chowdhury, Hamdy Mubarak, Ahmed Ali, Majd Hawasly, Nadir Durrani, Firoj Alam

    Abstract: The recent development and success of Large Language Models (LLMs) necessitate an evaluation of their performance across diverse NLP tasks in different languages. Although several frameworks have been developed and made publicly available, their customization capabilities for specific tasks and datasets are often complex for different users. In this study, we introduce the LLMeBench framework, whi…

    Submitted 26 February, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted as a demo paper at EACL 2024

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  11. arXiv:2308.02503  [pdf, other]

    eess.AS cs.CL cs.SD

    MyVoice: Arabic Speech Resource Collaboration Platform

    Authors: Yousseif Elshahawy, Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: We introduce MyVoice, a crowdsourcing platform designed to collect Arabic speech to enhance dialectal speech technologies. This platform offers an opportunity to design large dialectal speech datasets; and makes them publicly available. MyVoice allows contributors to select city/country-level fine-grained dialect and record the displayed utterances. Users can switch roles between contributors and…

    Submitted 23 July, 2023; originally announced August 2023.

    Comments: 2 pages, accepted at InterSpeech23 Show and Tell Session

  12. arXiv:2308.01978  [pdf, other]

    cs.SE

    Replicability Study: Corpora For Understanding Simulink Models & Projects

    Authors: Sohil Lal Shrestha, Shafiul Azam Chowdhury, Christoph Csallner

    Abstract: Background: Empirical studies on widely used model-based development tools such as MATLAB/Simulink are limited despite the tools' importance in various industries. Aims: The aim of this paper is to investigate the reproducibility of previous empirical studies that used Simulink model corpora and to evaluate the generalizability of their results to a newer and larger corpus, including a compariso…

    Submitted 9 August, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: Changed A4 paper to letter paper size

  13. arXiv:2306.01845  [pdf, other]

    cs.SD eess.AS

    Multi-View Multi-Task Representation Learning for Mispronunciation Detection

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: The disparity in phonology between learner's native (L1) and target (L2) language poses a significant challenge for mispronunciation detection and diagnosis (MDD) systems. This challenge is further intensified by lack of annotated L2 data. This paper proposes a novel MDD architecture that exploits multiple 'views' of the same input data assisted by auxiliary tasks to learn more distinctive phoneti…

    Submitted 7 August, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 5 pages, Accepted SLaTE23

  14. arXiv:2305.14982  [pdf, other]

    cs.CL cs.AI

    LAraBench: Benchmarking Arabic AI with Large Language Models

    Authors: Ahmed Abdelali, Hamdy Mubarak, Shammur Absar Chowdhury, Maram Hasanain, Basel Mousi, Sabri Boughorbel, Yassine El Kheir, Daniel Izham, Fahim Dalvi, Majd Hawasly, Nizi Nazar, Yousseif Elshahawy, Ahmed Ali, Nadir Durrani, Natasa Milic-Frayling, Firoj Alam

    Abstract: Recent advancements in Large Language Models (LLMs) have significantly influenced the landscape of language and speech research. Despite this progress, these models lack specific benchmarking against state-of-the-art (SOTA) models tailored to particular languages and tasks. LAraBench addresses this gap for Arabic Natural Language Processing (NLP) and Speech Processing tasks, including sequence tag…

    Submitted 5 February, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Foundation Models, Large Language Models, Arabic NLP, Arabic Speech, Arabic AI, GPT3.5 Evaluation, USM Evaluation, Whisper Evaluation, GPT-4, BLOOMZ, Jais13b

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  15. arXiv:2305.07790  [pdf]

    cond-mat.mtrl-sci cs.CV eess.IV

    Automated Grain Boundary (GB) Segmentation and Microstructural Analysis in 347H Stainless Steel Using Deep Learning and Multimodal Microscopy

    Authors: Shoieb Ahmed Chowdhury, M. F. N. Taufique, Jing Wang, Marissa Masden, Madison Wenzlick, Ram Devanathan, Alan L Schemer-Kohrn, Keerti S Kappagantula

    Abstract: Austenitic 347H stainless steel offers superior mechanical properties and corrosion resistance required for extreme operating conditions such as high temperature. The change in microstructure due to composition and process variations is expected to impact material properties. Identifying microstructural features such as grain boundaries thus becomes an important task in the process-microstructure-…

    Submitted 12 May, 2023; originally announced May 2023.

  16. arXiv:2305.07445  [pdf, other]

    eess.AS cs.CL cs.SD

    QVoice: Arabic Speech Pronunciation Learning Application

    Authors: Yassine El Kheir, Fouad Khnaisser, Shammur Absar Chowdhury, Hamdy Mubarak, Shazia Afzal, Ahmed Ali

    Abstract: This paper introduces a novel Arabic pronunciation learning application QVoice, powered with end-to-end mispronunciation detection and feedback generator module. The application is designed to support non-native Arabic speakers in enhancing their pronunciation skills, while also helping native speakers mitigate any potential influence from regional dialects on their Modern Standard Arabic (MSA) pr…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 2 pages, Accepted InterSpeech23 Show & Tell Demo Session

    Journal ref: InterSpeech 2023

  17. arXiv:2304.00649  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Multilingual Word Error Rate Estimation: e-WER3

    Authors: Shammur Absar Chowdhury, Ahmed Ali

    Abstract: The success of the multilingual automatic speech recognition systems empowered many voice-driven applications. However, measuring the performance of such systems remains a major challenge, due to its dependency on manually transcribed speech data in both mono- and multilingual scenarios. In this paper, we propose a novel multilingual framework -- eWER3 -- jointly trained on acoustic and lexical re…

    Submitted 2 April, 2023; originally announced April 2023.

    Comments: Accepted in ICASSP, Multilingual WER estimation, End-to-End systems, multilingual model, automatic word error rate estimation

  18. arXiv:2211.00923  [pdf, other]

    cs.SD cs.CL eess.AS

    SpeechBlender: Speech Augmentation Framework for Mispronunciation Data Generation

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali, Hamdy Mubarak, Shazia Afzal

    Abstract: The lack of labeled second language (L2) speech data is a major challenge in designing mispronunciation detection models. We introduce SpeechBlender - a fine-grained data augmentation pipeline for generating mispronunciation errors to overcome such data scarcity. The SpeechBlender utilizes varieties of masks to target different regions of phonetic units, and uses the mixing factors to linearly inte…

    Submitted 12 July, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: 5 pages

  19. arXiv:2206.08835  [pdf, other]

    cs.CL cs.SD eess.AS

    What can Speech and Language Tell us About the Working Alliance in Psychotherapy

    Authors: Sebastian P. Bayerl, Gabriel Roccabruna, Shammur Absar Chowdhury, Tommaso Ciulli, Morena Danieli, Korbinian Riedhammer, Giuseppe Riccardi

    Abstract: We are interested in the problem of conversational analysis and its application to the health domain. Cognitive Behavioral Therapy is a structured approach in psychotherapy, allowing the therapist to help the patient to identify and modify the malicious thoughts, behavior, or actions. This cooperative effort can be evaluated using the Working Alliance Inventory Observer-rated Shortened - a 12 item…

    Submitted 27 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Accepted at Interspeech 2022

  20. arXiv:2205.01842  [pdf, other]

    cs.SE

    An Empirical Study on Maintainable Method Size in Java

    Authors: Shaiful Alam Chowdhury, Gias Uddin, Reid Holmes

    Abstract: Code metrics have been widely used to estimate software maintenance effort. Metrics have generally been used to guide developer effort to reduce or avoid future maintenance burdens. Size is the simplest and most widely deployed metric. The size metric is pervasive because size correlates with many other common metrics (e.g., McCabe complexity, readability, etc.). Given the ease of computing a meth…

    Submitted 3 May, 2022; originally announced May 2022.

  21. SLNET: A Redistributable Corpus of 3rd-party Simulink Models

    Authors: Sohil Lal Shrestha, Shafiul Azam Chowdhury, Christoph Csallner

    Abstract: MATLAB/Simulink is widely used for model-based design. Engineers create Simulink models and compile them to embedded code, often to control safety-critical cyber-physical systems in automotive, aerospace, and healthcare applications. Despite Simulink's importance, there are few large-scale empirical Simulink studies, perhaps because there is no large readily available corpus of third-party open-so…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: Published in Mining Software Repositories 2022 - Data and Tool Showcase Track

    MSC Class: 68-11 ACM Class: D.2.13; E.0

  22. arXiv:2203.00271  [pdf, other]

    cs.CL cs.CY cs.SI

    ArabGend: Gender Analysis and Inference on Arabic Twitter

    Authors: Hamdy Mubarak, Shammur Absar Chowdhury, Firoj Alam

    Abstract: Gender analysis of Twitter can reveal important socio-cultural differences between male and female users. There has been a significant effort to analyze and automatically infer gender in the past for most widely spoken languages' content, however, to our knowledge very limited work has been done for Arabic. In this paper, we perform an extensive analysis of differences between male and female user…

    Submitted 1 March, 2022; originally announced March 2022.

    Comments: Gender Analysis Dataset, Demography, Arabic Twitter Accounts, Arabic Social Media Content

    MSC Class: 68T50 ACM Class: I.2.7

  23. arXiv:2201.06723  [pdf, other]

    cs.CL

    Emojis as Anchors to Detect Arabic Offensive Language and Hate Speech

    Authors: Hamdy Mubarak, Sabit Hassan, Shammur Absar Chowdhury

    Abstract: We introduce a generic, language-independent method to collect a large percentage of offensive and hate tweets regardless of their topics or genres. We harness the extralinguistic information embedded in the emojis to collect a large number of offensive tweets. We apply the proposed method on Arabic tweets and compare it with English tweets - analysing key cultural differences. We observed a const…

    Submitted 18 May, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

  24. arXiv:2201.06496  [pdf, other]

    cs.CL cs.SI

    ArCovidVac: Analyzing Arabic Tweets About COVID-19 Vaccination

    Authors: Hamdy Mubarak, Sabit Hassan, Shammur Absar Chowdhury, Firoj Alam

    Abstract: The emergence of the COVID-19 pandemic and the first global infodemic have changed our lives in many different ways. We relied on social media to get the latest information about the COVID-19 pandemic and at the same time to disseminate information. The content in social media consisted not only of health-related advice, plans, and informative news from policy makers, but also contained conspiracies…

    Submitted 17 January, 2022; originally announced January 2022.

    Comments: 8 pages, 9 figures

  25. arXiv:2201.02550  [pdf, other]

    cs.CL cs.SD eess.AS

    Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

    Authors: Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

    Abstract: The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical structure complexity, and domain mismatch. The most common method for addressing CS is to train an ASR system with the available transcribed CS speech, along with mono…

    Submitted 11 January, 2023; v1 submitted 7 January, 2022; originally announced January 2022.

  26. arXiv:2107.03844  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models

    Authors: Firoj Alam, Arid Hasan, Tanvirul Alam, Akib Khan, Janntatul Tajrin, Naira Khan, Shammur Absar Chowdhury

    Abstract: Bangla -- ranked as the 6th most widely spoken language across the world (https://www.ethnologue.com/guides/ethnologue200), with 230 million native speakers -- is still considered as a low-resource language in the natural language processing (NLP) community. With three decades of research, Bangla NLP (BNLP) is still lagging behind mainly due to the scarcity of resources and the challenges that com…

    Submitted 25 July, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

    Comments: Under Review, Bangla language processing, text classification, sequence tagging, datasets, benchmarks, transformer models

    MSC Class: 68T50 ACM Class: I.2.7

  27. arXiv:2107.00439  [pdf, other]

    cs.CL cs.SD eess.AS

    What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis

    Authors: Shammur Absar Chowdhury, Nadir Durrani, Ahmed Ali

    Abstract: Deep neural networks are inherently opaque and challenging to interpret. Unlike hand-crafted feature-based models, we struggle to comprehend the concepts learned and how they interact within these models. This understanding is crucial not only for debugging purposes but also for ensuring fairness in ethical decision-making. In our study, we conduct a post-hoc functional interpretability analysis o…

    Submitted 10 July, 2023; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: Accepted in CSL journal. Keywords: Speech, Neuron Analysis, Interpretability, Diagnostic Classifier, AI explainability, End-to-End Architecture

  28. arXiv:2106.13000  [pdf, other]

    cs.CL cs.SD eess.AS

    QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus

    Authors: Hamdy Mubarak, Amir Hussein, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: We introduce the largest transcribed Arabic speech corpus, QASR, collected from the broadcast domain. This multi-dialect speech dataset contains 2,000 hours of speech sampled at 16kHz crawled from Aljazeera news channel. The dataset is released with lightly supervised transcriptions, aligned with the audio segments. Unlike previous datasets, QASR contains linguistically motivated segmentation, pun…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: Speech Corpus, Spoken Conversation, ASR, Dialect Identification, Punctuation Restoration, Speaker Verification, NER, Named Entity, Arabic, Speaker gender, Turn-taking. Accepted in ACL 2021

  29. arXiv:2105.14779  [pdf, other]

    cs.CL cs.HC cs.SD eess.AS

    Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

    Authors: Shammur Absar Chowdhury, Amir Hussein, Ahmed Abdelali, Ahmed Ali

    Abstract: With the advent of globalization, there is an increasing demand for multilingual automatic speech recognition (ASR), handling language and dialectal variation of spoken content. Recent studies show its efficacy over monolingual systems. In this study, we design a large multilingual end-to-end ASR using self-attention based conformer architecture. We trained the system using Arabic (Ar), English (E…

    Submitted 5 July, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: Accepted in INTERSPEECH 2021, Multilingual ASR, Multi-dialectal ASR, Code-Switching ASR, Arabic ASR, Conformer, Transformer, E2E ASR, Speech Recognition, ASR, Arabic, English, French

  30. arXiv:2011.10106  [pdf]

    cs.CL cs.IR cs.LG

    Sentiment Classification in Bangla Textual Content: A Comparative Study

    Authors: Md. Arid Hasan, Jannatul Tajrin, Shammur Absar Chowdhury, Firoj Alam

    Abstract: Sentiment analysis has been widely used to understand our views on social and political agendas or user experiences over a product. It is one of the core, well-researched areas in NLP. However, for low-resource languages, like Bangla, one of the prominent challenges is the lack of resources. Another important limitation, in the current literature for Bangla, is the absence of comparable results…

    Submitted 19 November, 2020; originally announced November 2020.

    Comments: Accepted at ICCIT-2020

    MSC Class: 68T50 ACM Class: I.2.7

  31. Discovering Users Topic of Interest from Tweet

    Authors: Muhammad Kamal Hossen, Md. Ali Faiad, Md. Shahnur Azad Chowdhury, Md. Sajjatul Islam

    Abstract: Nowadays, social media has become one of the largest gatherings of people online. There are many ways for industries to promote their products to the public through advertising. The variety of advertisements is increasing dramatically. Businesses depend heavily on advertising because it has brought significant success in the market, and it is therefore practiced by major industries.…

    Submitted 10 March, 2018; originally announced March 2018.

    Journal ref: International Journal of Computer Science & Information Technology (IJCSIT) Vol 10, No 1, February 2018

  32. arXiv:1711.06095  [pdf, other]

    cs.CV cs.CL

    Depression Severity Estimation from Multiple Modalities

    Authors: Evgeny Stepanov, Stephane Lathuiliere, Shammur Absar Chowdhury, Arindam Ghosh, Radu-Laurentiu Vieriu, Nicu Sebe, Giuseppe Riccardi

    Abstract: Depression is a major debilitating disorder which can affect people from all ages. With a continuous increase in the number of annual cases of depression, there is a need to develop automatic techniques for the detection of the presence and extent of depression. In this AVEC challenge we explore different modalities (speech, language and visual features extracted from face) to design and develop a…

    Submitted 10 November, 2017; originally announced November 2017.

    Comments: 8 pages, 1 figure