Showing 1–15 of 15 results for author: Ram, D

  1. arXiv:2406.15570  [pdf, other]

    cs.CL cs.LG

    DEM: Distribution Edited Model for Training with Mixed Data Distributions

    Authors: Dhananjay Ram, Aditya Rawal, Momchil Hardalov, Nikolaos Pappas, Sheng Zha

    Abstract: Training with mixed data distributions is a common and important part of creating multi-task and instruction-following models. The diversity of the data distributions and cost of joint training makes the optimization procedure extremely challenging. Data mixing methods partially address this problem, albeit having a sub-optimal performance across data sources and require multiple expensive trainin…

    Submitted 21 June, 2024; originally announced June 2024.

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  2. arXiv:2404.10630  [pdf, other]

    cs.CL cs.LG

    HLAT: High-quality Large Language Model Pre-trained on AWS Trainium

    Authors: Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan

    Abstract: Getting large language models (LLMs) to perform well on the downstream tasks requires pre-training over trillions of tokens. This typically demands a large number of powerful computational devices in addition to a stable distributed training framework to accelerate the training. The growing number of applications leveraging AI/ML had led to a scarcity of the expensive conventional accelerators (su…

    Submitted 16 April, 2024; originally announced April 2024.

  3. Magnetic, thermodynamic, and magnetotransport properties of CeGaGe and PrGaGe single crystals

    Authors: Daloo Ram, Sudip Malick, Zakir Hossain, Dariusz Kaczorowski

    Abstract: We investigate the physical properties of high-quality single crystals CeGaGe and PrGaGe using magnetization, heat capacity, and magnetotransport measurements. Gallium-indium binary flux was used to grow these single crystals that crystallize in a body-centered tetragonal structure. Magnetic susceptibility data reveal a magnetic phase transition around 6.0 and 19.4 K in CeGaGe and PrGaGe, respecti…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 10 pages, 5 figures

    Journal ref: Phys. Rev. B 108, 024428 (2023)

  4. arXiv:2401.15464  [pdf, other]

    cond-mat.str-el cond-mat.mes-hall

    Electronic structure and physical properties of candidate topological material GdAgGe

    Authors: D. Ram, J. Singh, M. K. Hooda, O. Pavlosiuk, V. Kanchana, Z. Hossain, D. Kaczorowski

    Abstract: We grew needle-shaped single crystals of GdAgGe, which crystallizes in a noncentrosymmetric hexagonal crystal structure with space group P$\overline{6}$2$m$ (189). The magnetic susceptibility data for $H \perp c$ reveal two pronounced antiferromagnetic transitions at $T_{N1}$ = 20 K and $T_{N2}$ = 14.5 K. The magnetic susceptibility anomalies are less prominent for $H \parallel c$. The transition…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 9 pages, 9 figures

    Journal ref: Phys. Rev. B 107, 085137 (2023)

  5. arXiv:2312.10352  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Multiple magnetic transitions, metamagnetism and large magnetoresistance in GdAuGe single crystals

    Authors: D. Ram, J. Singh, M. K. Hooda, K. Singh, V. Kanchana, D. Kaczorowski, Z. Hossain

    Abstract: We report the physical properties of GdAuGe single crystals, which were grown using Bi flux. The powder x-ray diffraction data shows that the compound crystallizes in hexagonal NdPtSb-type structure (space group P63mc). Magnetization measurements performed for field configuration H||c and H||ab show that GdAuGe orders antiferromagnetically at the Neel temperature, TN = 17.2 K. Around this temperat…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 11 pages, 12 figures

    Journal ref: Phys. Rev. B 108, 235107 (2023)

  6. arXiv:2310.12442  [pdf, other]

    cs.CL cs.LG

    Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer

    Authors: Qingru Zhang, Dhananjay Ram, Cole Hawkins, Sheng Zha, Tuo Zhao

    Abstract: Pretrained transformer models have demonstrated remarkable performance across various natural language processing tasks. These models leverage the attention mechanism to capture long- and short-range dependencies in the sequence. However, the (full) attention mechanism incurs high computational cost - quadratic in the sequence length, which is not affordable in tasks with long sequences, e.g., inp…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings)

  7. arXiv:2309.01084  [pdf, other]

    astro-ph.SR astro-ph.EP

    Rotational Variability and Detection of Superflares in a Young Brown Dwarf by TESS

    Authors: Rajib Kumbhakar, Soumen Mondal, Samrat Ghosh, Diya Ram, Sudip Pramanik

    Abstract: We present a comprehensive analysis of a Transiting Exoplanet Survey Satellite (TESS) high-quality light curve for a young brown dwarf, MHO 4 having spectral type M7.0, in the Taurus star-forming region. We investigate the rotation periods and characterize the BD's dynamic atmosphere and surface features. We present light curve analysis of MHO 4, and estimate the rotation period to be around 2.224…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: 10 pages, 5 figures

  8. Evolutionary Dynamics of Social Inequality and Coincidence of Gini and Kolkata indices under Unrestricted Competition

    Authors: Suchismita Banerjee, Soumyajyoti Biswas, Bikas K. Chakrabarti, Sai Krishna Challagundla, Asim Ghosh, Suhaas Reddy Guntaka, Hanesh Koganti, Anvesh Reddy Kondapalli, Raju Maiti, Manipushpak Mitra, Dachepalli R. S. Ram

    Abstract: Social inequalities are ubiquitous and here we show that the values of the Gini ($g$) and Kolkata ($k$) indices, two generic inequality indices, approach each other (starting from $g = 0$ and $k = 0.5$ for equality) as the competitions grow in various social institutions like markets, universities, elections, etc. It is further showed that these two indices become equal and stabilize at a value (a…

    Submitted 4 October, 2022; v1 submitted 14 November, 2021; originally announced November 2021.

    Comments: 22 pages, 14 figures; International Journal of Modern Physics C (in press)

    Journal ref: International Journal of Modern Physics C (2023) 2350048

  9. arXiv:2109.14500  [pdf, ps, other]

    physics.soc-ph cond-mat.stat-mech physics.data-an

    Scaling Behavior of the Hirsch Index for Failure Avalanches, Percolation Clusters and Paper Citations

    Authors: Asim Ghosh, Bikas K. Chakrabarti, Dachepalli R. S. Ram, Manipushpak Mitra, Raju Maiti, Soumyajyoti Biswas, Suchismita Banerjee

    Abstract: A popular measure for citation inequalities of individual scientists has been the Hirsch index ($h$). If for any scientist the number $n_c$ of citations is plotted against the serial number $n_p$ of the paper having those many citations (when the papers are ordered from highest cited to lowest) then $h$ corresponds to the nearest lower integer value of $n_p$ below the fixed point of the non-linear…

    Submitted 24 October, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: 13 pages, 9 figures; Frontiers in Physics (in press)

  10. arXiv:1911.08332  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Neural Network based End-to-End Query by Example Spoken Term Detection

    Authors: Dhananjay Ram, Lesly Miculicich, Hervé Bourlard

    Abstract: This paper focuses on the problem of query by example spoken term detection (QbE-STD) in zero-resource scenario. State-of-the-art approaches primarily rely on dynamic time warping (DTW) based template matching techniques using phone posterior or bottleneck features extracted from a deep neural network (DNN). We use both monolingual and multilingual bottleneck features, and show that multilingual f…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: Submitted to IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING

  11. arXiv:1907.00443  [pdf, other]

    cs.CL cs.HC cs.LG cs.SD eess.AS

    Multilingual Bottleneck Features for Query by Example Spoken Term Detection

    Authors: Dhananjay Ram, Lesly Miculicich, Hervé Bourlard

    Abstract: State of the art solutions to query by example spoken term detection (QbE-STD) usually rely on bottleneck feature representation of the query and audio document to perform dynamic time warping (DTW) based template matching. Here, we present a study on QbE-STD performance using several monolingual as well as multilingual bottleneck features extracted from feed forward networks. Then, we propose to…

    Submitted 30 June, 2019; originally announced July 2019.

  12. arXiv:1905.03324  [pdf, ps, other]

    math.AP math.NA

    Mini-Max Algorithm via Pohozaev Manifold

    Authors: L. A. Maia, D. Raom, R. Ruviaro, Y. D. Sobral

    Abstract: A new algorithm for solving non-homogeneous asymptotically linear and superlinear problems is proposed. The ground state solution of the problem, which in general is obtained as a mini-max of the associated functional, is obtained as the minimum of the functional constrained to the Pohozaev manifold instead. Examples are given of the use of this method for finding numerical radially symmetric posi…

    Submitted 8 May, 2019; originally announced May 2019.

    Comments: 26 pages

    MSC Class: 35J20; 35J61; 35J10; 65N99; 65N22

  13. arXiv:1809.01576  [pdf, other]

    cs.CL

    Document-Level Neural Machine Translation with Hierarchical Attention Networks

    Authors: Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson

    Abstract: Neural Machine Translation (NMT) can be improved by including document-level contextual information. For this purpose, we propose a hierarchical attention model to capture the context in a structured and dynamic manner. The model is integrated in the original NMT architecture as another level of abstraction, conditioning on the NMT model's own previous hidden states. Experiments show that hierarch…

    Submitted 1 October, 2018; v1 submitted 5 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018

  14. Self-Attentive Residual Decoder for Neural Machine Translation

    Authors: Lesly Miculicich Werlen, Nikolaos Pappas, Dhananjay Ram, Andrei Popescu-Belis

    Abstract: Neural sequence-to-sequence networks with attention have achieved remarkable performance for machine translation. One of the reasons for their effectiveness is their ability to capture relevant source-side contextual information at each time-step prediction through an attention mechanism. However, the target-side context is solely based on the sequence model which, in practice, is prone to a recen…

    Submitted 1 October, 2018; v1 submitted 14 September, 2017; originally announced September 2017.

    Comments: Accepted on NAACL-HLT 2018, Volume: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

  15. arXiv:1610.05948  [pdf, ps, other]

    cs.SD cs.CL stat.AP

    A Bayesian Approach to Estimation of Speaker Normalization Parameters

    Authors: Dhananjay Ram, Debasis Kundu, Rajesh M. Hegde

    Abstract: In this work, a Bayesian approach to speaker normalization is proposed to compensate for the degradation in performance of a speaker independent speech recognition system. The speaker normalization method proposed herein uses the technique of vocal tract length normalization (VTLN). The VTLN parameters are estimated using a novel Bayesian approach which utilizes the Gibbs sampler, a special type o…

    Submitted 19 October, 2016; originally announced October 2016.

    Comments: 23 Pages, 9 Figures