Showing 1–16 of 16 results for author: Rana, A

Searching in archive eess.
  1. arXiv:2401.07360  [pdf, other]

    cs.CL cs.SD eess.AS

    Promptformer: Prompted Conformer Transducer for ASR

    Authors: Sergio Duarte-Torres, Arunasish Sen, Aman Rana, Lukas Drude, Alejandro Gomez-Alanis, Andreas Schwarz, Leif Rädel, Volker Leutnant

    Abstract: Context cues carry information which can improve multi-turn interactions in automatic speech recognition (ASR) systems. In this paper, we introduce a novel mechanism inspired by hyper-prompting to fuse textual context with acoustic representations in the attention mechanism. Results on a test set with multi-turn interactions show that our method achieves 5.9% relative word error rate reduction (rW…

    Submitted 14 January, 2024; originally announced January 2024.

  2. arXiv:2109.10443  [pdf, other]

    cs.RO eess.SY

    Geometric Fabrics: Generalizing Classical Mechanics to Capture the Physics of Behavior

    Authors: Karl Van Wyk, Mandy Xie, Anqi Li, Muhammad Asif Rana, Buck Babich, Bryan Peele, Qian Wan, Iretiayo Akinola, Balakumar Sundaralingam, Dieter Fox, Byron Boots, Nathan D. Ratliff

    Abstract: Classical mechanical systems are central to controller design in energy shaping methods of geometric control. However, their expressivity is limited by position-only metrics and the intimate link between metric and geometry. Recent work on Riemannian Motion Policies (RMPs) has shown that shedding these restrictions results in powerful design tools, but at the expense of theoretical stability guara…

    Submitted 18 January, 2022; v1 submitted 21 September, 2021; originally announced September 2021.

  3. arXiv:2105.07962  [pdf]

    eess.IV cs.CV eess.SP

    DFENet: A Novel Dimension Fusion Edge Guided Network for Brain MRI Segmentation

    Authors: Hritam Basak, Rukhshanda Hussain, Ajay Rana

    Abstract: The rapid increase in brain stroke morbidity in the last few years has been a driving force towards fast and accurate segmentation of stroke lesions from brain MRI images. With the recent development of deep learning, computer-aided segmentation methods for ischemic stroke lesions have been useful for clinicians in early diagnosis and treatment planning. However, most of these methods suff…

    Submitted 22 October, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

    Comments: Submitted at SN Computer Science

  4. arXiv:2103.05922  [pdf, other]

    cs.RO cs.LG eess.SY

    RMP2: A Structured Composable Policy Class for Robot Learning

    Authors: Anqi Li, Ching-An Cheng, M. Asif Rana, Man Xie, Karl Van Wyk, Nathan Ratliff, Byron Boots

    Abstract: We consider the problem of learning motion policies for acceleration-based robotics systems with a structured policy class specified by RMPflow. RMPflow is a multi-task control framework that has been successfully applied in many robotics problems. Using RMPflow as a structured policy class in learning has several benefits, such as sufficient expressiveness, the flexibility to inject different lev…

    Submitted 10 March, 2021; originally announced March 2021.

  5. arXiv:2101.10396  [pdf, other]

    eess.IV cs.CV

    Quality Assessment of Super-Resolved Omnidirectional Image Quality Using Tangential Views

    Authors: Cagri Ozcinar, Aakanksha Rana

    Abstract: Omnidirectional images (ODIs), also known as 360-degree images, enable viewers to explore all directions of a given 360-degree scene from a fixed point. Designing an immersive imaging system with ODI is challenging as such systems require very large resolution coverage of the entire 360 viewing space to provide an enhanced quality of experience (QoE). Despite remarkable progress on single image su…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: Paper Accepted at Electronic Imaging

  6. arXiv:2010.12065  [pdf]

    q-bio.QM cs.CV cs.LG eess.IV

    A generalized deep learning model for multi-disease Chest X-Ray diagnostics

    Authors: Nabit Bajwa, Kedar Bajwa, Atif Rana, M. Faique Shakeel, Kashif Haqqi, Suleiman Ali Khan

    Abstract: We investigate the generalizability of a deep convolutional neural network (CNN) on the task of disease classification from chest x-rays collected over multiple sites. We systematically train the model using datasets from three independent sites with different patient populations: National Institute of Health (NIH), Stanford University Medical Centre (CheXpert), and Shifa International Hospital (SIH…

    Submitted 17 October, 2020; originally announced October 2020.

  7. arXiv:2008.01116  [pdf, other]

    eess.IV cs.CV

    Sub-Pixel Back-Projection Network For Lightweight Single Image Super-Resolution

    Authors: Supratik Banerjee, Cagri Ozcinar, Aakanksha Rana, Aljosa Smolic, Michael Manzke

    Abstract: Convolutional neural network (CNN)-based methods have achieved great success for single-image super-resolution (SISR). However, most models attempt to improve reconstruction accuracy at the cost of an increasing number of model parameters. To tackle this problem, in this paper, we study reducing the number of parameters and computational cost of CNN-based SISR methods while maintaining the a…

    Submitted 3 August, 2020; originally announced August 2020.

    Comments: To appear in IMVIP 2020

  8. arXiv:2005.13143  [pdf, other]

    cs.RO cs.LG eess.SY

    Euclideanizing Flows: Diffeomorphic Reduction for Learning Stable Dynamical Systems

    Authors: Muhammad Asif Rana, Anqi Li, Dieter Fox, Byron Boots, Fabio Ramos, Nathan Ratliff

    Abstract: Robotic tasks often require motions with complex geometric structures. We present an approach to learn such motions from a limited number of human demonstrations by exploiting the regularity properties of human motions, e.g., stability, smoothness, and boundedness. The complex motions are encoded as rollouts of a stable dynamical system, which, under a change of coordinates defined by a diffeomorphi…

    Submitted 21 September, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: 2nd Annual Conference on Learning for Dynamics and Control (L4DC) 2020 -- Revised Version

  9. arXiv:2004.11475  [pdf, other]

    cs.CV eess.IV

    Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos

    Authors: Mamshad Nayeem Rizve, Ugur Demir, Praveen Tirupattur, Aayush Jung Rana, Kevin Duarte, Ishan Dave, Yogesh Singh Rawat, Mubarak Shah

    Abstract: Activity detection in security videos is a difficult problem due to multiple factors such as large field of view, presence of multiple activities, varying scales and viewpoints, and its untrimmed nature. The existing research in activity detection is mainly focused on datasets, such as UCF-101, JHMDB, THUMOS, and AVA, which partially address these issues. The requirement of processing the security…

    Submitted 19 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: 9 pages

  10. arXiv:2004.10445  [pdf, other]

    math.OC eess.IV

    RESIRE: real space iterative reconstruction engine for Tomography

    Authors: Minh Pham, Yakun Yuan, Arjun Rana, Jianwei Miao, Stanley Osher

    Abstract: Tomography has made a revolutionary impact on diverse fields, ranging from macro-/mesoscopic scale studies in biology, radiology, and plasma physics to the characterization of 3D atomic structure in material science. The fundamental task of tomography is to reconstruct a 3D object from a set of 2D projections. To solve the tomography problem, many algorithms have been developed. Among them are methods usin…

    Submitted 25 April, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

  11. arXiv:1908.08505  [pdf, other]

    cs.MM cs.GR cs.LG eess.IV

    ColorNet -- Estimating Colorfulness in Natural Images

    Authors: Emin Zerman, Aakanksha Rana, Aljosa Smolic

    Abstract: Measuring the colorfulness of a natural or virtual scene is critical for many applications in the image processing field, ranging from capturing to display. In this paper, we propose the first deep learning-based colorfulness estimation metric. For this purpose, we develop a color rating model which simultaneously learns to extract the pertinent characteristic color features and the mapping from featu…

    Submitted 22 August, 2019; originally announced August 2019.

    Comments: Accepted to IEEE International Conference on Image Processing (ICIP) 2019

  12. arXiv:1908.06752  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    Towards Generating Ambisonics Using Audio-Visual Cue for Virtual Reality

    Authors: Aakanksha Rana, Cagri Ozcinar, Aljoscha Smolic

    Abstract: Ambisonics, i.e., full-sphere surround sound, is quintessential with 360-degree visual content to provide a realistic virtual reality (VR) experience. While 360-degree visual content capture gained a tremendous boost recently, the estimation of corresponding spatial sound is still challenging due to the required sound-field microphones or information about the sound-source locations. In this pape…

    Submitted 16 August, 2019; originally announced August 2019.

    Comments: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  13. arXiv:1908.04297  [pdf, other]

    cs.CV cs.LG cs.MM eess.IV

    Super-resolution of Omnidirectional Images Using Adversarial Learning

    Authors: Cagri Ozcinar, Aakanksha Rana, Aljosa Smolic

    Abstract: An omnidirectional image (ODI) enables viewers to look in every direction from a fixed point through a head-mounted display providing an immersive experience compared to that of a standard image. Designing immersive virtual reality systems with ODIs is challenging as they require high resolution content. In this paper, we study super-resolution for ODIs and propose an improved generative adversari…

    Submitted 12 August, 2019; originally announced August 2019.

  14. arXiv:1908.04197  [pdf, other]

    eess.IV cs.CV cs.GR

    Deep Tone Mapping Operator for High Dynamic Range Images

    Authors: Aakanksha Rana, Praveer Singh, Giuseppe Valenzise, Frederic Dufaux, Nikos Komodakis, Aljosa Smolic

    Abstract: A computationally fast tone mapping operator (TMO) that can quickly adapt to a wide spectrum of high dynamic range (HDR) content is quintessential for visualization on varied low dynamic range (LDR) output devices such as movie screens or standard displays. Existing TMOs can successfully tone-map only a limited range of HDR content and require extensive parameter tuning to yield the best subje…

    Submitted 12 August, 2019; originally announced August 2019.

  15. High Accuracy Tumor Diagnoses and Benchmarking of Hematoxylin and Eosin Stained Prostate Core Biopsy Images Generated by Explainable Deep Neural Networks

    Authors: Aman Rana, Alarice Lowe, Marie Lithgow, Katharine Horback, Tyler Janovitz, Annacarolina Da Silva, Harrison Tsai, Vignesh Shanmugam, Hyung-** Yoon, Pratik Shah

    Abstract: Histopathological diagnosis of tumors in tissue biopsy after Hematoxylin and Eosin (H&E) staining is the gold standard for oncology care. H&E staining is slow and uses dyes, reagents and precious tissue samples that cannot be reused. Thousands of native nonstained RGB Whole Slide Image (RWSI) patches of prostate core tissue biopsies were registered with their H&E stained versions. Conditional Gene…

    Submitted 2 August, 2019; originally announced August 2019.

    Journal ref: JAMA Network. 2020;3(5):e205111

  16. arXiv:1906.01875  [pdf, other]

    eess.IV math.OC

    A semi-implicit relaxed Douglas-Rachford algorithm (sir-DR) for Ptychography

    Authors: Minh Pham, Arjun Rana, Jianwei Miao, Stanley Osher

    Abstract: Alternating projection based methods, such as ePIE and rPIE, have been used widely in ptychography. However, they only work well if there are adequate measurements (diffraction patterns); in the case of sparse data (i.e., fewer measurements), alternating projection underperforms and might not even converge. In this paper, we propose semi-implicit relaxed Douglas-Rachford (sir-DR), an accelerated ite…

    Submitted 5 June, 2019; originally announced June 2019.