Skip to main content

Showing 1–16 of 16 results for author: :

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.12793  [pdf, other

    cs.CL

    ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

    Authors: Team GLM, :, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, **g Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang , et al. (32 additional authors not shown)

    Abstract: We introduce ChatGLM, an evolving family of large language models that we have been develo** over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.11704  [pdf, other

    cs.CL cs.AI cs.LG

    Nemotron-4 340B Technical Report

    Authors: Nvidia, :, Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek , et al. (58 additional authors not shown)

    Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and its outputs. These models perform competitively to open access models on a wide range of evaluation be… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2403.04652  [pdf, other

    cs.CL cs.AI

    Yi: Open Foundation Models by 01.AI

    Authors: 01. AI, :, Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, **g Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie , et al. (7 additional authors not shown)

    Abstract: We introduce the Yi model family, a series of language and multimodal models that demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and 34B pretrained language models, then we extend them to chat models, 200K long context models, depth-upscaled models, and vision-language models. Our base models achieve strong performance on a wide range of benchmarks like MMLU,… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

  4. arXiv:2401.02954  [pdf, other

    cs.CL cs.AI cs.LG

    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

    Authors: DeepSeek-AI, :, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, Jianzhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li , et al. (63 additional authors not shown)

    Abstract: The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

  5. Queer In AI: A Case Study in Community-Led Participatory AI

    Authors: Organizers Of QueerInAI, :, Anaelia Ovalle, Arjun Subramonian, Ashwin Singh, Claas Voelcker, Danica J. Sutherland, Davide Locatelli, Eva Breznik, Filip Klubička, Hang Yuan, Hetvi J, Huan Zhang, Jaidev Shriram, Kruno Lehman, Luca Soldaini, Maarten Sap, Marc Peter Deisenroth, Maria Leonor Pacheco, Maria Ryskina, Martin Mundt, Milind Agarwal, Nyx McLean, Pan Xu, A Pranav , et al. (26 additional authors not shown)

    Abstract: We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over the years. We discuss different challenges that emerged in the process, look at ways this organization has fallen short of operationalizing participatory and intersectional principles, and then assess th… ▽ More

    Submitted 8 June, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: To appear at FAccT 2023

    Journal ref: 2023 ACM Conference on Fairness, Accountability, and Transparency

  6. arXiv:2211.05100  [pdf, other

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop, :, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major , et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access… ▽ More

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  7. arXiv:2206.06444  [pdf

    cs.AI cs.CY stat.AP

    A method for comparing multiple imputation techniques: a case study on the U.S. National COVID Cohort Collaborative

    Authors: Elena Casiraghi, Rachel Wong, Margaret Hall, Ben Coleman, Marco Notaro, Michael D. Evans, Jena S. Tronieri, Hannah Blau, Bryan Laraway, Tiffany J. Callahan, Lauren E. Chan, Carolyn T. Bramante, John B. Buse, Richard A. Moffitt, Til Sturmer, Steven G. Johnson, Yu Raymond Shao, Justin Reese, Peter N. Robinson, Alberto Paccanaro, Giorgio Valentini, Jared D. Huling, Kenneth Wilkins, :, Tell Bennet , et al. (12 additional authors not shown)

    Abstract: Healthcare datasets obtained from Electronic Health Records have proven to be extremely useful to assess associations between patients' predictors and outcomes of interest. However, these datasets often suffer from missing values in a high proportion of cases and the simple removal of these cases may introduce severe bias. For these reasons, several multiple imputation algorithms have been propose… ▽ More

    Submitted 25 September, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

  8. arXiv:2112.13511  [pdf

    cs.RO

    Design, Manufacturing, and Controls of a Prismatic Quadruped Robot: PRISMA

    Authors: Team Robocon, IIT Roorkee, :, Bhavya Giri Goswami, Aman Verma, Gautam Jha, Vandan Gajjar, Vedant Neekhra, Utkarsh Deepak, Aayush Singh Chauhan

    Abstract: Most of the quadrupeds developed are highly actuated, and their control is hence quite cumbersome. They need advanced electronics equipment to solve convoluted inverse kinematic equations continuously. In addition, they demand special and costly sensors to autonomously navigate through the environment as traditional distance sensors usually fail because of the continuous perturbation due to the mo… ▽ More

    Submitted 26 December, 2021; originally announced December 2021.

    Comments: 14 pages, 16 figures, 4 tables

  9. arXiv:2012.13117  [pdf, other

    cs.DL cs.CY

    Nine Best Practices for Research Software Registries and Repositories: A Concise Guide

    Authors: Task Force on Best Practices for Software Registries, :, Alain Monteil, Alejandra Gonzalez-Beltran, Alexandros Ioannidis, Alice Allen, Allen Lee, Anita Bandrowski, Bruce E. Wilson, Bryce Mecum, Cai Fan Du, Carly Robinson, Daniel Garijo, Daniel S. Katz, David Long, Genevieve Milliken, Hervé Ménager, Jessica Hausman, Jurriaan H. Spaaks, Katrina Fenlon, Kristin Vanderbilt, Lorraine Hwang, Lynn Davis, Martin Fenner, Michael R. Crusoe , et al. (8 additional authors not shown)

    Abstract: Scientific software registries and repositories serve various roles in their respective disciplines. These resources improve software discoverability and research transparency, provide information for software citations, and foster preservation of computational methods that might otherwise be lost over time, thereby supporting research reproducibility and replicability. However, develo** these r… ▽ More

    Submitted 24 December, 2020; originally announced December 2020.

    Comments: 18 pages

  10. arXiv:2007.10970  [pdf

    cs.CY astro-ph.IM

    Recommendations for Planning Inclusive Astronomy Conferences

    Authors: Inclusive Astronomy 2 Local Organizing Committee, :, Brian Brooks, Keira Brooks, Lea Hagen, Nimish Hathi, Samantha Hoffman, James Paranilam, Laura Prichard

    Abstract: The Inclusive Astronomy (IA) conference series aims to create a safe space where community members can listen to the experiences of marginalized individuals in astronomy, discuss actions being taken to address inequities, and give recommendations to the community for how to improve diversity, equity, and inclusion in astronomy. The first IA was held in Nashville, TN, USA, 17-19 June, 2015. The Inc… ▽ More

    Submitted 21 July, 2020; originally announced July 2020.

    Comments: 41 pages. An editable version of the document and contact information available here: https://outerspace.stsci.edu/display/IA2/LOC+Recommendations

  11. arXiv:1912.06680  [pdf, other

    cs.LG stat.ML

    Dota 2 with Large Scale Deep Reinforcement Learning

    Authors: OpenAI, :, Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, Rafal Józefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique P. d. O. Pinto, Jonathan Raiman, Tim Salimans, Jeremy Schlatter, Jonas Schneider, Szymon Sidor, Ilya Sutskever, Jie Tang , et al. (2 additional authors not shown)

    Abstract: On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems. OpenAI Five leveraged existing reinforcement learnin… ▽ More

    Submitted 13 December, 2019; originally announced December 2019.

  12. INDIGO-DataCloud:A data and computing platform to facilitate seamless access to e-infrastructures

    Authors: INDIGO-DataCloud Collaboration, :, Davide Salomoni, Isabel Campos, Luciano Gaido, Jesus Marco de Lucas, Peter Solagna, Jorge Gomes, Ludek Matyska, Patrick Fuhrman, Marcus Hardt, Giacinto Donvito, Lukasz Dutka, Marcin Plociennik, Roberto Barbera, Ignacio Blanquer, Andrea Ceccanti, Mario David, Cristina Duma, Alvaro López-García, Germán Moltó, Pablo Orviz, Zdenek Sustr, Matthew Viljoen, Fernando Aguilar , et al. (40 additional authors not shown)

    Abstract: This paper describes the achievements of the H2020 project INDIGO-DataCloud. The project has provided e-infrastructures with tools, applications and cloud framework enhancements to manage the demanding requirements of scientific communities, either locally or through enhanced interfaces. The middleware developed allows to federate hybrid resources, to easily write, port and run scientific applicat… ▽ More

    Submitted 5 February, 2019; v1 submitted 6 November, 2017; originally announced November 2017.

    Comments: 39 pages, 15 figures.Version accepted in Journal of Grid Computing

  13. arXiv:1303.1051  [pdf

    cs.NE

    A Genetic algorithm to solve the container storage space allocation problem

    Authors: I. Ayachi, R. Kammarti, M. Ksouri, P. Borne, :, LAGIS, Ecole Centrale de Lille, :, LACS, Ecole Nationale des Ingenieurs de Tunis

    Abstract: This paper presented a genetic algorithm (GA) to solve the container storage problem in the port. This problem is studied with different container types such as regular, open side, open top, tank, empty and refrigerated containers. The objective of this problem is to determine an optimal containers arrangement, which respects customers delivery deadlines, reduces the rehandle operations of contain… ▽ More

    Submitted 5 March, 2013; originally announced March 2013.

    Comments: 2010 International conference on Computational Intelligence and Vehicular System (CIVS)

  14. arXiv:0806.1156  [pdf

    cs.LG

    Utilisation des grammaires probabilistes dans les tâches de segmentation et d'annotation prosodique

    Authors: Irina Nesterenko, Stéphane Rauzy

    Abstract: Nous présentons dans cette contribution une approche à la fois symbolique et probabiliste permettant d'extraire l'information sur la segmentation du signal de parole à partir d'information prosodique. Nous utilisons pour ce faire des grammaires probabilistes possédant une structure hiérarchique minimale. La phase de construction des grammaires ainsi que leur pouvoir de prédiction sont évalués qu… ▽ More

    Submitted 6 June, 2008; originally announced June 2008.

    Report number: 3267

    Journal ref: Journées d'Etudes sur la Parole, Avignon : France (2008)

  15. arXiv:0705.4415  [pdf

    cs.SE

    PERCEVAL: a Computer-Driven System for Experimentation on Auditory and Visual Perception

    Authors: Carine André, Alain Ghio, Christian Cavé, Bernard Teston

    Abstract: Since perception tests are highly time-consuming, there is a need to automate as many operations as possible, such as stimulus generation, procedure control, perception testing, and data analysis. The computer-driven system we are presenting here meets these objectives. To achieve large flexibility, the tests are controlled by scripts. The system's core software resembles that of a lexical-synta… ▽ More

    Submitted 30 May, 2007; originally announced May 2007.

    Report number: 1557

    Journal ref: Proceedings of International Congress of Phonetic Sciences (ICPhS) (2003) 1421-1424

  16. arXiv:cmp-lg/9506008  [pdf, ps

    cs.CL

    CLiFF Notes: Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

    Authors: Editors, :, Matthew Stone, Libby Levison

    Abstract: Short abstracts by computational linguistics researchers at the University of Pennsylvania describing ongoing individual and joint projects.

    Submitted 9 June, 1995; originally announced June 1995.

    Comments: Annual Research Survey. 112 pages. uuencoded compressed postscript. Available as http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html

    Report number: Technical Report CIS 95-07