Showing 1–50 of 85 results for author: Rana, R

  1. arXiv:2404.05049 [pdf, other]

    cs.CV

    PlateSegFL: A Privacy-Preserving License Plate Detection Using Federated Segmentation Learning

    Authors: Md. Shahriar Rahman Anuvab, Mishkat Sultana, Md. Atif Hossain, Shashwata Das, Suvarthi Chowdhury, Rafeed Rahman, Dibyo Fabian Dofadar, Shahriar Rahman Rana

    Abstract: Automatic License Plate Recognition (ALPR) is an integral component of an intelligent transport system with extensive applications in secure transportation, vehicle-to-vehicle communication, stolen vehicles detection, traffic violations, and traffic flow management. The existing license plate detection system focuses on one-shot learners or pre-trained models that operate with a geometric bounding…

    Submitted 7 April, 2024; originally announced April 2024.

  2. arXiv:2403.14318 [pdf]

    cs.CV

    A Lightweight Attention-based Deep Network via Multi-Scale Feature Fusion for Multi-View Facial Expression Recognition

    Authors: Ali Ezati, Mohammadreza Dezyani, Rajib Rana, Roozbeh Rajabi, Ahmad Ayatollahi

    Abstract: Convolutional neural networks (CNNs) and their variations have shown effectiveness in facial expression recognition (FER). However, they face challenges when dealing with high computational complexity and multi-view head poses in real-world scenarios. We introduce a lightweight attentional network incorporating multi-scale feature fusion (LANMSFF) to tackle these issues. For the first challenge, w…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 9 pages, two-column, submitted to journal

  3. arXiv:2403.14083 [pdf, other]

    cs.SD cs.LG eess.AS

    emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition

    Authors: Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Berrak Sisman, Bjorn W. Schuller, Carlos Busso

    Abstract: Speech Emotion Recognition (SER) is crucial for enabling computers to understand the emotions conveyed in human communication. With recent advancements in Deep Learning (DL), the performance of SER models has significantly improved. However, designing an optimal DL architecture requires specialised knowledge and experimental assessments. Fortunately, Neural Architecture Search (NAS) provides a pot…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Submitted to IEEE Transactions on Affective Computing on February 19, 2024. arXiv admin note: text overlap with arXiv:2305.14402

  4. arXiv:2403.13230 [pdf, other]

    cs.NI

    BFT-PoLoc: A Byzantine Fortified Trigonometric Proof of Location Protocol using Internet Delays

    Authors: Peiyao Sheng, Vishal Sevani, Ranvir Rana, Himanshu Tyagi, Pramod Viswanath

    Abstract: Internet platforms depend on accurately determining the geographical locations of online users to deliver targeted services (e.g., advertising). The advent of decentralized platforms (blockchains) emphasizes the importance of geographically distributed nodes, making the validation of locations more crucial. In these decentralized settings, mutually non-trusting participants need to prove the…

    Submitted 28 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  5. arXiv:2402.07241 [pdf, other]

    cs.CR

    Proof of Diligence: Cryptoeconomic Security for Rollups

    Authors: Peiyao Sheng, Ranvir Rana, Himanshu Tyagi, Pramod Viswanath

    Abstract: Layer 1 (L1) blockchains such as Ethereum are secured under an "honest supermajority of stake" assumption for a large pool of validators who verify each and every transaction on it. This high security comes at a scalability cost which not only affects the throughput of the blockchain but also results in high gas fees for executing transactions on chain. The most successful solution for this proble…

    Submitted 11 February, 2024; originally announced February 2024.

  6. arXiv:2310.09413 [pdf, other]

    cs.LG cs.GT

    ZeroSwap: Data-driven Optimal Market Making in DeFi

    Authors: Viraj Nadkarni, Jiachen Hu, Ranvir Rana, Chi Jin, Sanjeev Kulkarni, Pramod Viswanath

    Abstract: Automated Market Makers (AMMs) are major centers of matching liquidity supply and demand in Decentralized Finance. Their functioning relies primarily on the presence of liquidity providers (LPs) incentivized to invest their assets into a liquidity pool. However, the prices at which a pooled asset is traded are often more stale than the prices on centralized and more liquid exchanges. This leads to…

    Submitted 29 April, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  7. arXiv:2310.09053 [pdf, other]

    cs.RO cs.AI eess.SY

    DATT: Deep Adaptive Trajectory Tracking for Quadrotor Control

    Authors: Kevin Huang, Rwik Rana, Alexander Spitzer, Guanya Shi, Byron Boots

    Abstract: Precise arbitrary trajectory tracking for quadrotors is challenging due to unknown nonlinear dynamics, trajectory infeasibility, and actuation limits. To tackle these challenges, we present Deep Adaptive Trajectory Tracking (DATT), a learning-based approach that can precisely track arbitrary, potentially infeasible trajectories in the presence of large disturbances in the real world. DATT builds o…

    Submitted 13 December, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

  8. arXiv:2310.04703 [pdf, other]

    cs.CL cs.HC cs.LG

    Integrating Contrastive Learning into a Multitask Transformer Model for Effective Domain Adaptation

    Authors: Chung-Soo Ahn, Jagath C. Rajapakse, Rajib Rana

    Abstract: While speech emotion recognition (SER) research has made significant progress, achieving generalization across various corpora continues to pose a problem. We propose a novel domain adaptation technique that embodies a multitask framework with SER as the primary task, and contrastive learning and information maximisation loss as auxiliary tasks, underpinned by fine-tuning of transformers pre-train…

    Submitted 7 October, 2023; originally announced October 2023.

    MSC Class: Speech Emotion Recognition; Domain adaptation

  9. arXiv:2310.04590 [pdf, other]

    cs.RO cs.LG

    Deep Model Predictive Optimization

    Authors: Jacob Sacks, Rwik Rana, Kevin Huang, Alex Spitzer, Guanya Shi, Byron Boots

    Abstract: A major challenge in robotics is to design robust policies which enable complex and agile behaviors in the real world. On one end of the spectrum, we have model-free reinforcement learning (MFRL), which is incredibly flexible and general but often results in brittle policies. In contrast, model predictive control (MPC) continually re-plans at each time step to remain robust to perturbations and mo…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: Main paper is 6 pages with 4 figures and 1 table. Code available at: https://github.com/jisacks/dmpo

  10. arXiv:2309.00965 [pdf, other]

    hep-th cond-mat.stat-mech math-ph quant-ph

    Inequivalent $Z_2^n$-graded brackets, $n$-bit parastatistics and statistical transmutations of supersymmetric quantum mechanics

    Authors: M. M. Balbino, I. P. de Freitas, R. G. Rana, F. Toppan

    Abstract: Given an associative ring of $Z_2^n$-graded operators, the number of inequivalent brackets of Lie-type which are compatible with the grading and satisfy graded Jacobi identities is $b_n= n+\lfloor n/2\rfloor+1$. This follows from the Rittenberg-Wyler and Scheunert analysis of "color" Lie (super)algebras which is revisited here in terms of Boolean logic gates. The inequivalent brackets, recovered f…

    Submitted 2 September, 2023; originally announced September 2023.

    Comments: 57 pages, 16 figures

    Report number: CBPF-NF-002/23

  11. arXiv:2308.08084 [pdf]

    physics.chem-ph

    Architecture Optimization Dramatically Improves Reverse Bias Stability in Perovskite Solar Cells: A Role of Polymer Hole Transport Layers

    Authors: Fangyuan Jiang, Yangwei Shi, Tanka R. Rana, Daniel Morales, Isaac Gould, Declan P. McCarthy, Joel Smith, Grey Christoforo, Hannah Contreras, Stephen Barlow, Aditya D. Mohite, Henry Snaith, Seth R. Marder, J. Devin MacKenzie, Michael D. McGehee, David S. Ginger

    Abstract: We report that device architecture engineering has a substantial impact on the reverse bias instability that has been reported as a critical issue in commercializing perovskite solar cells. We demonstrate breakdown voltages exceeding -15 V in typical pin structured perovskite solar cells via two steps: i) using polymer hole transporting materials; ii) using a more electrochemically stable gold ele…

    Submitted 15 August, 2023; originally announced August 2023.

  12. arXiv:2307.16562 [pdf, other]

    cs.CR

    SAKSHI: Decentralized AI Platforms

    Authors: Suma Bhat, Canhui Chen, Zerui Cheng, Zhixuan Fang, Ashwin Hebbar, Sreeram Kannan, Ranvir Rana, Peiyao Sheng, Himanshu Tyagi, Pramod Viswanath, Xuechao Wang

    Abstract: Large AI models (e.g., Dall-E, GPT4) have electrified the scientific, technological and societal landscape through their superhuman capabilities. These services are offered largely in a traditional web2.0 format (e.g., OpenAI's GPT4 service). As more large AI models proliferate (personalizing and specializing to a variety of domains), there is a tremendous need to have a neutral trust-free platfor…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: 23 pages, 9 figures

  13. arXiv:2306.12834 [pdf, other]

    cs.CL cs.CY

    Natural Language Processing in Electronic Health Records in Relation to Healthcare Decision-making: A Systematic Review

    Authors: Elias Hossain, Rajib Rana, Niall Higgins, Jeffrey Soar, Prabal Datta Barua, Anthony R. Pisani, Ph. D, Kathryn Turner

    Abstract: Background: Natural Language Processing (NLP) is widely used to extract clinical insights from Electronic Health Records (EHRs). However, the lack of annotated data, automated tools, and other challenges hinder the full utilisation of NLP for EHRs. Various Machine Learning (ML), Deep Learning (DL) and NLP techniques are studied and compared to understand the limitations and opportunities in this s…

    Submitted 22 June, 2023; originally announced June 2023.

  14. arXiv:2305.14402 [pdf, other]

    cs.SD cs.LG eess.AS

    Enhancing Speech Emotion Recognition Through Differentiable Architecture Search

    Authors: Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Berrak Sisman, Björn Schuller

    Abstract: Speech Emotion Recognition (SER) is a critical enabler of emotion-aware communication in human-computer interactions. Recent advancements in Deep Learning (DL) have substantially enhanced the performance of SER models through increased model complexity. However, designing optimal DL architectures requires prior experience and experimental evaluations. Encouragingly, Neural Architecture Search (NAS…

    Submitted 18 January, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 5 pages, 4 figures

  15. arXiv:2211.07290 [pdf, other]

    cs.HC

    AI-Based Emotion Recognition: Promise, Peril, and Prescriptions for Prosocial Path

    Authors: Siddique Latif, Hafiz Shehbaz Ali, Muhammad Usama, Rajib Rana, Björn Schuller, Junaid Qadir

    Abstract: Automated emotion recognition (AER) technology can detect humans' emotional states in real-time using facial expressions, voice attributes, text, body movements, and neurological signals and has a broad range of applications across many sectors. It helps businesses get a much deeper understanding of their customers, enables monitoring of individuals' moods in healthcare, education, or the automoti…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Under review in IEEE TAC

  16. arXiv:2210.02627 [pdf, other]

    cs.CL cs.IR

    Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering

    Authors: Shamane Siriwardhana, Rivindu Weerasekera, Elliott Wen, Tharindu Kaluarachchi, Rajib Rana, Suranga Nanayakkara

    Abstract: Retrieval Augmented Generation (RAG) is a recent advancement in Open-Domain Question Answering (ODQA). RAG has only been trained and explored with a Wikipedia-based external knowledge base and is not optimized for use in other specialized domains such as healthcare and news. In this paper, we evaluate the impact of joint training of the retriever and generator components of RAG for the task of domai…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: This paper is awaiting publication at Transactions of the Association for Computational Linguistics. This is a pre-MIT Press publication version. For associated huggingface transformers code, see https://github.com/huggingface/transformers/tree/main/examples/research_projects/rag-end2end-retriever

  17. Optimal Bootstrapping of PoW Blockchains

    Authors: Ranvir Rana, Dimitris Karakostas, Sreeram Kannan, Aggelos Kiayias, Pramod Viswanath

    Abstract: Proof of Work (PoW) blockchains are susceptible to adversarial majority mining attacks in the early stages due to incipient participation and corresponding low net hash power. Bootstrapping ensures safety and liveness during the transient stage by protecting against a majority mining attack, allowing a PoW chain to grow the participation base and corresponding mining hash power. Liveness is especi…

    Submitted 22 August, 2022; originally announced August 2022.

  18. arXiv:2208.05890 [pdf, other]

    cs.CL cs.AI

    Speech Synthesis with Mixed Emotions

    Authors: Kun Zhou, Berrak Sisman, Rajib Rana, B. W. Schuller, Haizhou Li

    Abstract: Emotional speech synthesis aims to synthesize human voices with various emotional effects. The current studies are mostly focused on imitating an averaged style belonging to a specific emotion type. In this paper, we seek to generate speech with a mixture of emotions at run-time. We propose a novel formulation that measures the relative difference between the speech samples of different emotions.…

    Submitted 28 December, 2022; v1 submitted 11 August, 2022; originally announced August 2022.

    Comments: Accepted to IEEE Transactions on Affective Computing

  19. arXiv:2207.12248 [pdf, other]

    cs.SD cs.LG eess.AS

    Domain Adapting Deep Reinforcement Learning for Real-world Speech Emotion Recognition

    Authors: Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Bjorn W. Schuller

    Abstract: Computers can understand and then engage with people in an emotionally intelligent way thanks to speech-emotion recognition (SER). However, the performance of SER in cross-corpus and real-world live data feed scenarios can be significantly improved. The inability to adapt an existing model to a new domain is one of the shortcomings of SER methods. To address this challenge, researchers have develo…

    Submitted 23 September, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

  20. arXiv:2207.05298 [pdf, other]

    cs.SD eess.AS

    Multitask Learning from Augmented Auxiliary Data for Improving Speech Emotion Recognition

    Authors: Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn W. Schuller

    Abstract: Despite the recent progress in speech emotion recognition (SER), state-of-the-art systems lack generalisation across different conditions. A key underlying reason for poor generalisation is the scarcity of emotion datasets, which is a significant roadblock to designing robust machine learning (ML) models. Recent works in SER focus on utilising multitask learning (MTL) methods to improve generalisa…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: Under review at IEEE Transactions on Affective Computing

  21. arXiv:2204.08625 [pdf, other]

    cs.SD eess.AS

    Self Supervised Adversarial Domain Adaptation for Cross-Corpus and Cross-Language Speech Emotion Recognition

    Authors: Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn Schuller

    Abstract: Despite the recent advancement in speech emotion recognition (SER) within a single corpus setting, the performance of these SER systems degrades significantly for cross-corpus and cross-language scenarios. The key reason is the lack of generalisation in SER systems towards unseen conditions, which causes them to perform poorly in cross-corpus and cross-language settings. Recent studies focus on ut…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: Accepted in IEEE Transactions on Affective Computing

  22. arXiv:2201.03967 [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Emotion Intensity and its Control for Emotional Voice Conversion

    Authors: Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li

    Abstract: Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while preserving the linguistic content and speaker identity. In EVC, emotions are usually treated as discrete categories overlooking the fact that speech also conveys emotions with various intensity levels that the listener can perceive. In this paper, we aim to explicitly characterize and control the intensity…

    Submitted 18 July, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

    Comments: Accepted by IEEE Transactions on Affective Computing

  23. arXiv:2111.15343 [pdf, other]

    cs.RO

    Fast and Real-time End to End Control in Autonomous Racing Cars Through Representation Learning

    Authors: Praveen Venkatesh, Rwik Rana, Harish PM

    Abstract: The challenges presented in an autonomous racing situation are distinct from those faced in regular autonomous driving and require faster end-to-end algorithms and consideration of a longer horizon in determining optimal current actions keeping in mind upcoming maneuvers and situations. In this paper, we propose an end-to-end method for autonomous racing that takes in as inputs video information f…

    Submitted 30 November, 2021; originally announced November 2021.

  24. arXiv:2106.14184 [pdf, other]

    cs.CV

    Memory Guided Road Detection

    Authors: Praveen Venkatesh, Rwik Rana, Varun Jain

    Abstract: In self driving car applications, there is a requirement to predict the location of the lane given an input RGB front facing image. In this paper, we propose an architecture that allows us to increase the speed and robustness of road detection without a large hit in accuracy by introducing an underlying shared feature space that is propagated over time, which serves as a flowing dynamic memory. By…

    Submitted 27 June, 2021; originally announced June 2021.

  25. The Lorentz-violating real scalar field at thermal equilibrium

    Authors: A. R. Aguirre, G. Flores-Hidalgo, R. G. Rana, E. S. Souza

    Abstract: In this paper we study Lorentz-violation (LV) effects on the thermodynamic properties of a real scalar field theory due to the presence of a constant background tensor field. In particular, we analyse and compute explicitly the deviations of the internal energy, pressure, and entropy of the system at thermal equilibrium due to the LV contributions. For the free massless scalar field we obtain exac…

    Submitted 15 March, 2021; originally announced March 2021.

    Comments: 26 pages, 1 figure

    Journal ref: Eur. Phys. J. C (2021) 81:459

  26. arXiv:2102.03242 [pdf]

    physics.optics

    Optical Kerr nonlinearity and multi-photon absorption of DSTMS measured by Z-scan method

    Authors: Jiang Li, Rakesh Rana, Liguo Zhu, Cangli Liu, Harald Schneider, Alexej Pashkin

    Abstract: We investigate the optical Kerr nonlinearity and multi-photon absorption (MPA) properties of DSTMS excited by femtosecond pulses at a wavelength of 1.43 μm, which is optimal for terahertz generation via difference frequency mixing. The MPA and the optical Kerr coefficients of DSTMS at 1.43 μm are strongly anisotropic indicating a dominating contribution from cascaded 2nd-order nonlinearity. These…

    Submitted 25 August, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

    Comments: 5 pages, 4 figures

  27. arXiv:2101.00738 [pdf, other]

    cs.SD cs.LG eess.AS

    A novel policy for pre-trained Deep Reinforcement Learning for Speech Emotion Recognition

    Authors: Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Björn W. Schuller, Jiajun Liu

    Abstract: Reinforcement Learning (RL) is a semi-supervised learning paradigm in which an agent learns by interacting with an environment. Deep learning in combination with RL, which provides an efficient method to learn how to interact with the environment, is called Deep Reinforcement Learning (deep RL). Deep RL has gained tremendous success in gaming, such as AlphaGo, but its potential has rarely been explored fo…

    Submitted 31 January, 2021; v1 submitted 3 January, 2021; originally announced January 2021.

  28. arXiv:2006.00877 [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder

    Authors: Kazi Nazmul Haque, Rajib Rana, Björn W Schuller

    Abstract: Unsupervised disentangled representation learning from unlabelled audio data and high-fidelity audio generation have become two linchpins in the machine learning research fields. However, the representation learned from an unsupervised setting does not guarantee its usability for any downstream task at hand, which can be a waste of resources if the training was conducted for that part…

    Submitted 17 October, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: The paper is submitted to IEEE Access for review

  29. arXiv:2005.11172 [pdf, other]

    eess.AS cs.SD

    Deep Reinforcement Learning with Pre-training for Time-efficient Training of Automatic Speech Recognition

    Authors: Thejan Rajapakshe, Siddique Latif, Rajib Rana, Sara Khalifa, Björn W. Schuller

    Abstract: Deep reinforcement learning (deep RL) is a combination of deep learning with reinforcement learning principles to create efficient methods that can learn by interacting with their environment. This has led to breakthroughs in many complex tasks, such as playing the game "Go", that were previously difficult to solve. However, deep RL requires significant training time making it difficult to use in va…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1910.11256

  30. arXiv:2005.09610 [pdf, other]

    cs.CR cs.DC cs.GT cs.IT

    Free2Shard: Adaptive-adversary-resistant sharding via Dynamic Self Allocation

    Authors: Ranvir Rana, Sreeram Kannan, David Tse, Pramod Viswanath

    Abstract: Propelled by the growth of large-scale blockchain deployments, much recent progress has been made in designing sharding protocols that achieve throughput scaling linearly in the number of nodes. However, existing protocols are not robust to an adversary adaptively corrupting a fixed fraction of nodes. In this paper, we propose Free2Shard -- a new architecture that achieves near-linear scaling whil…

    Submitted 19 May, 2020; originally announced May 2020.

  31. arXiv:2005.08453 [pdf, other]

    cs.SD eess.AS

    Deep Architecture Enhancing Robustness to Noise, Adversarial Attacks, and Cross-corpus Setting for Speech Emotion Recognition

    Authors: Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn W. Schuller

    Abstract: Speech emotion recognition (SER) systems can achieve high accuracy when the training and test data are identically distributed, but this assumption is frequently violated in practice and the performance of SER systems plummets against unforeseen data shifts. The design of robust models for accurate SER is challenging, which limits its use in practical applications. In this paper we propose a deeper…

    Submitted 25 July, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: Accepted in INTERSPEECH 2020

  32. arXiv:2005.08447 [pdf, other]

    cs.SD eess.AS

    Augmenting Generative Adversarial Networks for Speech Emotion Recognition

    Authors: Siddique Latif, Muhammad Asim, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn W. Schuller

    Abstract: Generative adversarial networks (GANs) have shown potential in learning emotional attributes and generating new data samples. However, their performance is usually hindered by the unavailability of larger speech emotion recognition (SER) data. In this work, we propose a framework that utilises the mixup data augmentation scheme to augment the GAN in feature learning and generation. To show the eff…

    Submitted 25 July, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: Accepted in INTERSPEECH 2020

  33. arXiv:2004.00484 [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Nonlinear Charge Transport in InGaAs Nanowires at Terahertz Frequencies

    Authors: Rakesh Rana, Leila Balaghi, Ivan Fotev, Harald Schneider, Manfred Helm, Emmanouil Dimakis, Alexej Pashkin

    Abstract: We probe the electron transport properties in the shell of GaAs/In0.2Ga0.8As core/shell nanowires at high electric fields using optical pump / THz probe spectroscopy with broadband THz pulses and peak electric fields up to 0.6 MV/cm. The plasmon resonance of the photoexcited charge carriers exhibits a systematic redshift and a suppression of its spectral weight for THz driving fields exceeding 0.4…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: includes supporting information

  34. arXiv:2003.02836 [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Guided Generative Adversarial Neural Network for Representation Learning and High Fidelity Audio Generation using Fewer Labelled Audio Data

    Authors: Kazi Nazmul Haque, Rajib Rana, John H. L. Hansen, Björn Schuller

    Abstract: Recent improvements in Generative Adversarial Neural Networks (GANs) have shown their ability to generate higher quality samples as well as to learn good representations for transfer learning. Most of the representation learning methods based on GANs learn representations ignoring their post-use scenario, which can lead to increased generalisation ability. However, the model can become redundant i…

    Submitted 1 June, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: 10 pages

  35. arXiv:2001.00378 [pdf, other]

    cs.SD cs.LG eess.AS

    Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends

    Authors: Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Junaid Qadir, Björn W. Schuller

    Abstract: Research on speech processing has traditionally considered the task of designing hand-engineered acoustic features (feature engineering) as a separate distinct problem from the task of designing efficient machine learning (ML) models to make prediction and classification decisions. There are two main drawbacks to this approach: firstly, the feature engineering being manual is cumbersome and requir…

    Submitted 24 September, 2021; v1 submitted 2 January, 2020; originally announced January 2020.

    Comments: Part of this work is accepted in IEEE Transactions on Affective Computing 2021. https://ieeexplore.ieee.org/document/9543566

  36. Galaxy And Mass Assembly (GAMA): Properties and evolution of red spiral galaxies

    Authors: Smriti Mahajan, Kriti Kamal Gupta, Rahul Rana, M. J. I. Brown, S. Phillipps, Joss Bland-Hawthorn, M. N. Bremer, S. Brough, B. W. Holwerda, A. M. Hopkins, J. Loveday, Kevin Pimbblet, Lingyu Wang

    Abstract: We use multi-wavelength data from the Galaxy and Mass Assembly (GAMA) survey to explore the cause of red optical colours in nearby (0.002<z<0.06) spiral galaxies. We show that the colours of red spiral galaxies are a direct consequence of some environment-related mechanism(s) which has removed dust and gas, leading to a lower star formation rate. We conclude that this process acts on long timescal…

    Submitted 25 October, 2019; originally announced October 2019.

    Comments: Accepted for publication in the MNRAS; 11 pages; 12 figures

  37. arXiv:1910.11256 [pdf, other]

    cs.SD cs.LG

    Pre-training in Deep Reinforcement Learning for Automatic Speech Recognition

    Authors: Thejan Rajapakshe, Rajib Rana, Siddique Latif, Sara Khalifa, Björn W. Schuller

    Abstract: Deep reinforcement learning (deep RL) is a combination of deep learning with reinforcement learning principles to create efficient methods that can learn by interacting with their environment. This led to breakthroughs in many complex tasks that were previously difficult to solve. However, deep RL requires a large amount of training time that makes it difficult to use in various real-life applicatio…

    Submitted 26 October, 2019; v1 submitted 24 October, 2019; originally announced October 2019.

  38. arXiv:1909.08719 [pdf, other]

    cs.CR cs.DC cs.IT

    Barracuda: The Power of $\ell$-polling in Proof-of-Stake Blockchains

    Authors: Giulia Fanti, Jiantao Jiao, Ashok Makkuva, Sewoong Oh, Ranvir Rana, Pramod Viswanath

    Abstract: A blockchain is a database of sequential events that is maintained by a distributed group of nodes. A key consensus problem in blockchains is that of determining the next block (data element) in the sequence. Many blockchains address this by electing a new node to propose each new block. The new block is (typically) appended to the tip of the proposer's local blockchain, and subsequently broadcast…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: ACM Mobihoc 2019, Best paper award

  39. arXiv:1907.06078 [pdf, other]

    cs.SD eess.AS

    Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition

    Authors: Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Julien Epps, Björn W. Schuller

    Abstract: In spite of the emerging importance of Speech Emotion Recognition (SER), the state-of-the-art accuracy is quite low and needs improvement to make commercial applications of SER viable. A key underlying reason for the low accuracy is the scarcity of emotion datasets, which is a challenge for developing any robust machine learning model in general. In this paper, we propose a solution to this problem: a…

    Submitted 22 March, 2020; v1 submitted 13 July, 2019; originally announced July 2019.

    Comments: Accepted in IEEE Transactions on Affective Computing

  40. arXiv:1904.08613 [pdf, other]

    cs.LG cs.CV stat.ML

    Disentangled Representation Learning with Information Maximizing Autoencoder

    Authors: Kazi Nazmul Haque, Siddique Latif, Rajib Rana

    Abstract: Learning disentangled representations from unlabelled data is a non-trivial problem. In this paper we propose Information Maximising Autoencoder (InfoAE) where the encoder learns powerful disentangled representation through maximizing the mutual information between the representation and given information in an unsupervised fashion. We have evaluated our model on the MNIST dataset and achieved 98.9…

    Submitted 18 April, 2019; originally announced April 2019.

  41. arXiv:1904.03833 [pdf, other]

    cs.SD eess.AS

    Direct Modelling of Speech Emotion from Raw Speech

    Authors: Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Julien Epps

    Abstract: Speech emotion recognition is a challenging task and heavily depends on hand-engineered acoustic features, which are typically crafted to echo human perception of speech signals. However, a filter bank that is designed from perceptual evidence is not always guaranteed to be the best in a statistical modelling framework where the end goal is, for example, emotion classification. This has fuelled the…

    Submitted 27 July, 2020; v1 submitted 8 April, 2019; originally announced April 2019.

    Comments: INTERSPEECH 2019

  42. arXiv:1902.09944  [pdf, ps, other

    cs.HC

    Automated Screening for Distress: A Perspective for the Future

    Authors: Rajib Rana, Siddique Latif, Raj Gururajan, Anthony Gray, Geraldine Mackenzie, Gerald Humphris, Jeff Dunn

    Abstract: Distress is a complex condition which affects a significant percentage of cancer patients and may lead to depression, anxiety, sadness, suicide and other forms of psychological morbidity. Compelling evidence supports screening for distress as a means of facilitating early intervention and subsequent improvements in psychological well-being and overall quality of life. Nevertheless, despite the exi… ▽ More

    Submitted 27 July, 2020; v1 submitted 22 February, 2019; originally announced February 2019.

    Comments: Accepted in European Journal of Cancer Care

  43. arXiv:1811.11402  [pdf, other

    cs.LG cs.CR eess.SP stat.ML

    Adversarial Machine Learning And Speech Emotion Recognition: Utilizing Generative Adversarial Networks For Robustness

    Authors: Siddique Latif, Rajib Rana, Junaid Qadir

    Abstract: Deep learning has undoubtedly offered tremendous improvements in the performance of state-of-the-art speech emotion recognition (SER) systems. However, recent research on adversarial examples poses enormous challenges on the robustness of SER systems by showing the susceptibility of deep neural networks to adversarial examples as they rely only on small and imperceptible perturbations. In this stu… ▽ More

    Submitted 30 December, 2018; v1 submitted 28 November, 2018; originally announced November 2018.

  44. arXiv:1811.09750  [pdf, other

    cs.CV

    Automating Motion Correction in Multishot MRI Using Generative Adversarial Networks

    Authors: Siddique Latif, Muhammad Asim, Muhammad Usman, Junaid Qadir, Rajib Rana

    Abstract: Multishot Magnetic Resonance Imaging (MRI) has recently gained popularity as it accelerates the MRI data acquisition process without compromising the quality of the final MR image. However, it suffers from motion artifacts caused by patient movements, which may lead to misdiagnosis. Modern state-of-the-art motion correction techniques are able to counter small-degree motion; however, their adoption is… ▽ More

    Submitted 23 November, 2018; originally announced November 2018.

    Journal ref: MED-NIPS 2018

  45. Non-thermal nature of photo-induced insulator-to-metal transition in NbO$_2$

    Authors: Rakesh Rana, J. Michael Klopf, Jörg Grenzer, Harald Schneider, Manfred Helm, Alexej Pashkin

    Abstract: We study the photo-induced metallization process in niobium dioxide NbO$_2$. This compound undergoes the thermal insulator-to-metal transition at the remarkably high temperature of 1080 K. Our optical pump - terahertz probe measurements reveal the ultrafast switching of the film on a sub-picosecond timescale and the formation of a metastable metallic phase when the incident pump fluence exceeds th… ▽ More

    Submitted 20 September, 2018; v1 submitted 19 September, 2018; originally announced September 2018.

    Journal ref: Phys. Rev. B 99, 041102 (2019)

  46. arXiv:1805.09317  [pdf, other

    stat.ML cs.LG

    Communication Algorithms via Deep Learning

    Authors: Hyeji Kim, Yihan Jiang, Ranvir Rana, Sreeram Kannan, Sewoong Oh, Pramod Viswanath

    Abstract: Coding theory is a central discipline underpinning wireline and wireless modems that are the workhorses of the information age. Progress in coding theory is largely driven by individual human ingenuity with sporadic breakthroughs over the past century. In this paper we study whether it is possible to automate the discovery of decoding algorithms via deep learning. We study a family of sequential c… ▽ More

    Submitted 23 May, 2018; originally announced May 2018.

    Comments: 19 pages, published as a conference paper at ICLR 2018

  47. arXiv:1802.05869  [pdf, other

    cond-mat.mtrl-sci cond-mat.str-el quant-ph

    Carrier driven antiferromagnetism and exchange-bias in SrRuO3/CaRuO3 heterostructures

    Authors: Parul Pandey, Ching-Hao Chang, Angus Huang, Rakesh Rana, Changan Wang, Chi Xu, Horng-Tay Jeng, Manfred Helm, R. Ganesh, Shengqiang Zhou

    Abstract: Oxide heterostructures exhibit a rich variety of magnetic and transport properties which arise due to contact at an interface. This can lead to surprising effects that are very different from the bulk properties of the materials involved. We report the magnetic properties of bilayers of SrRuO3, a well known ferromagnet, and CaRuO3, which is nominally a paramagnet. We find intriguing features that… ▽ More

    Submitted 16 February, 2018; originally announced February 2018.

    Comments: 6 pages and 4 figures

  48. arXiv:1801.08322  [pdf, other

    cs.CV

    Phonocardiographic Sensing using Deep Learning for Abnormal Heartbeat Detection

    Authors: Siddique Latif, Muhammad Usman, Rajib Rana, Junaid Qadir

    Abstract: Cardiac auscultation involves expert interpretation of abnormalities in heart sounds using a stethoscope. Deep learning based cardiac auscultation is of significant interest to the healthcare community as it can help reduce the burden of manual auscultation with automated detection of abnormal heartbeats. However, the problem of automatic cardiac auscultation is complicated due to the requirement… ▽ More

    Submitted 27 July, 2020; v1 submitted 25 January, 2018; originally announced January 2018.

    Journal ref: IEEE Sensors Journal 2018

  49. arXiv:1801.06353  [pdf, ps, other

    cs.CV cs.CL

    Transfer Learning for Improving Speech Emotion Classification Accuracy

    Authors: Siddique Latif, Rajib Rana, Shahzad Younis, Junaid Qadir, Julien Epps

    Abstract: The majority of existing speech emotion recognition research focuses on automatic emotion detection using training and testing data from the same corpus collected under the same conditions. The performance of such systems has been shown to drop significantly in cross-corpus and cross-language scenarios. To address the problem, this paper exploits a transfer learning technique to improve the performanc… ▽ More

    Submitted 27 July, 2020; v1 submitted 19 January, 2018; originally announced January 2018.

    Comments: Proc. Interspeech 2018

  50. arXiv:1801.05141  [pdf, other

    stat.ML cs.CV

    Image denoising and restoration with CNN-LSTM Encoder Decoder with Direct Attention

    Authors: Kazi Nazmul Haque, Mohammad Abu Yousuf, Rajib Rana

    Abstract: Image denoising is always a challenging task in the field of computer vision and image processing. In this paper, we have proposed an encoder-decoder model with direct attention, which is capable of denoising and reconstructing highly corrupted images. Our model consists of an encoder and a decoder, where the encoder is a convolutional neural network and the decoder is a multilayer Long Short-Term memory… ▽ More

    Submitted 16 January, 2018; originally announced January 2018.