Showing 1–19 of 19 results for author: Masoud, M

Searching in archive cs.
  1. arXiv:2405.01997  [pdf]

    cs.CL cs.AI

    Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on the Travelling Salesman Problem Using GPT-3.5 Turbo

    Authors: Mahmoud Masoud, Ahmed Abdelhay, Mohammed Elhenawy

    Abstract: Large Language Models (LLMs) are deep learning models designed to generate text based on textual input. Although researchers have been developing these models for more complex tasks such as code generation and general reasoning, few efforts have explored how LLMs can be applied to combinatorial problems. In this research, we investigate the potential of LLMs to solve the Travelling Salesman Proble…

    Submitted 3 May, 2024; originally announced May 2024.

  2. arXiv:2310.16162  [pdf, other]

    cs.LG

    Brainchop: Next Generation Web-Based Neuroimaging Application

    Authors: Mohamed Masoud, Pratyush Reddy, Farfalla Hu, Sergey Plis

    Abstract: Performing volumetric image processing directly within the browser, particularly with medical data, presents unprecedented challenges compared to conventional backend tools. These challenges arise from limitations inherent in browser environments, such as constrained computational resources and the availability of frontend machine learning libraries. Consequently, there is a shortage of neuroimagi…

    Submitted 24 October, 2023; originally announced October 2023.

  3. arXiv:2303.03915  [pdf, other]

    cs.CL cs.AI

    The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

    Authors: Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, Teven Le Scao, Leandro Von Werra, Chenghao Mou, Eduardo González Ponferrada, Huu Nguyen, Jörg Frohberg, Mario Šaško, Quentin Lhoest, Angelina McMillan-Major, Gerard Dupont, Stella Biderman, Anna Rogers, Loubna Ben allal, Francesco De Toni, Giada Pistilli, Olivier Nguyen, Somaieh Nikpoor, Maraim Masoud, Pierre Colombo, Javier de la Rosa, et al. (29 additional authors not shown)

    Abstract: As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed with the goal of researching and training large language models as a values-driven undertaking, putting issues of ethics, harm, and governance in the f…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2022, Datasets and Benchmarks Track

    ACM Class: I.2.7

  4. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  5. arXiv:2208.00932  [pdf, other]

    cs.CL

    Masader Plus: A New Interface for Exploring +500 Arabic NLP Datasets

    Authors: Yousef Altaher, Ali Fadel, Mazen Alotaibi, Mazen Alyazidi, Mishari Al-Mutairi, Mutlaq Aldhbuiub, Abdulrahman Mosaibah, Abdelrahman Rezk, Abdulrazzaq Alhendi, Mazen Abo Shal, Emad A. Alghamdi, Maged S. Alshaibani, Jezia Zakraoui, Wafaa Mohammed, Kamel Gaanoun, Khalid N. Elmadani, Mustafa Ghaleb, Nouamane Tazi, Raed Alharbi, Maraim Masoud, Zaid Alyafeai

    Abstract: Masader (Alyafeai et al., 2021) created a metadata structure to be used for cataloguing Arabic NLP datasets. However, developing an easy way to explore such a catalogue is a challenging task. In order to give the optimal experience for users and researchers exploring the catalogue, several design and user experience challenges must be resolved. Furthermore, user interactions with the website may p…

    Submitted 1 August, 2022; originally announced August 2022.

  6. arXiv:2206.03216  [pdf, other]

    cs.CY cs.AI cs.CL

    Data Governance in the Age of Large-Scale Data-Driven Language Technology

    Authors: Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Isaac Johnson, Dragomir Radev, Somaieh Nikpoor, Jörg Frohberg, Aaron Gokaslan, Peter Henderson, Rishi Bommasani, Margaret Mitchell

    Abstract: The recent emergence and adoption of Machine Learning technology, and specifically of Large Language Models, has drawn attention to the need for systematic and transparent management of language data. This work proposes an approach to global language data governance that attempts to organize data management amongst stakeholders, values, and rights. Our proposal is informed by prior work on distrib…

    Submitted 2 November, 2022; v1 submitted 3 May, 2022; originally announced June 2022.

    Comments: 32 pages: Full paper and Appendices; Association for Computing Machinery, New York, NY, USA, 2206-2222

    Journal ref: Proceedings of 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22)

  7. arXiv:2201.10066  [pdf, other]

    cs.CL cs.DB

    Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources

    Authors: Angelina McMillan-Major, Zaid Alyafeai, Stella Biderman, Kimbo Chen, Francesco De Toni, Gérard Dupont, Hady Elsahar, Chris Emezue, Alham Fikri Aji, Suzana Ilić, Nurulaqilla Khamis, Colin Leong, Maraim Masoud, Aitor Soroa, Pedro Ortiz Suarez, Zeerak Talat, Daniel van Strien, Yacine Jernite

    Abstract: In recent years, large-scale data collection efforts have prioritized the amount of data collected in order to improve the modeling capabilities of large language models. This prioritization, however, has resulted in concerns with respect to the rights of data subjects represented in data collections, particularly when considering the difficulty in interrogating these collections due to insufficie…

    Submitted 24 January, 2022; originally announced January 2022.

    Comments: 8 pages plus appendix and references

  8. arXiv:2110.06744  [pdf, other]

    cs.CL

    Masader: Metadata Sourcing for Arabic Text and Speech Data Resources

    Authors: Zaid Alyafeai, Maraim Masoud, Mustafa Ghaleb, Maged S. Al-shaibani

    Abstract: The NLP pipeline has evolved dramatically in the last few years. The first step in the pipeline is to find suitable annotated datasets to evaluate the tasks we are trying to solve. Unfortunately, most of the published datasets lack metadata annotations that describe their attributes. Not to mention, the absence of a public catalogue that indexes all the publicly available datasets related to speci…

    Submitted 13 October, 2021; originally announced October 2021.

  9. arXiv:2110.03104  [pdf]

    cs.LG cs.AI math.OC

    Hybrid Pointer Networks for Traveling Salesman Problems Optimization

    Authors: Ahmed Stohy, Heba-Tullah Abdelhakam, Sayed Ali, Mohammed Elhenawy, Abdallah A Hassan, Mahmoud Masoud, Sebastien Glaser, Andry Rakotonirainy

    Abstract: In this work, a novel idea is presented for combinatorial optimization problems, a hybrid network, which results in a superior outcome. We applied this method to graph pointer networks [1], expanding its capabilities to a higher level. We proposed a hybrid pointer network (HPN) to solve the travelling salesman problem trained by reinforcement learning. Furthermore, HPN builds upon graph pointer ne…

    Submitted 13 October, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

  10. arXiv:2010.06041  [pdf]

    cs.CL

    Towards Machine Translation for the Kurdish Language

    Authors: Sina Ahmadi, Mariam Masoud

    Abstract: Machine translation is the task of translating texts from one language to another using computers. It has been one of the major tasks in natural language processing and computational linguistics and has been motivating to facilitate human communication. Kurdish, an Indo-European language, has received little attention in this realm due to the language being less-resourced. Therefore, in this paper…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: 12 pages - under review in the ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)

  11. arXiv:2009.13398  [pdf, ps, other]

    cs.CL

    Aspects of Terminological and Named Entity Knowledge within Rule-Based Machine Translation Models for Under-Resourced Neural Machine Translation Scenarios

    Authors: Daniel Torregrosa, Nivranshu Pasricha, Maraim Masoud, Bharathi Raja Chakravarthi, Juan Alonso, Noe Casas, Mihael Arcan

    Abstract: Rule-based machine translation is a machine translation paradigm where linguistic knowledge is encoded by an expert in the form of rules that translate text from source to target language. While this approach grants extensive control over the output of the system, the cost of formalising the needed linguistic knowledge is much higher than training a corpus-based system, where a machine learning ap…

    Submitted 28 September, 2020; originally announced September 2020.

  12. A Review on Drivers Red Light Running Behavior Predictions and Technology Based Countermeasures

    Authors: Md Mostafizur Rahman Komol, Jack Pinnow, Mohammed Elhenawy, Shamsunnahar Yasmin, Mahmoud Masoud, Sebastien Glaser, Andry Rakotonirainy

    Abstract: Red light running at signalised intersections is a growing road safety issue worldwide, leading to the rapid development of advanced intelligent transportation technologies and countermeasures. However, existing studies have yet to summarise and present the effect of these technology based innovations in improving safety. This paper represents a comprehensive review of red light running behaviour…

    Submitted 13 March, 2022; v1 submitted 15 August, 2020; originally announced August 2020.

    Comments: Published in IEEE ACCESS

    Journal ref: in IEEE Access, vol. 10, pp. 25309-25326, 2022

  13. arXiv:2007.15585  [pdf]

    cs.CY physics.soc-ph

    Developing a Novel Crowdsourcing Business Model for Micro-Mobility Ride-Sharing Systems: Methodology and Preliminary Results

    Authors: Mohammed Elhenawy, MD Mostafizur Rahman Komol, Huthaifa I. Ashqar, Mohammed Hamad Almannaa, Mahmoud Masoud, Hesham A. Rakha, Andry Rakotonirainy

    Abstract: Micro-mobility ride-sharing is an emerging technology that provides access to the transit system with minimum environmental impacts. Significant research is required to ensure that micro-mobility ride-sharing provides a better fulfilment of user needs. In this study, we propose a novel business model for the micro-mobility ride-sharing system where light vehicles such as electric scooters and elec…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: Submitted to TRB 2021

  14. arXiv:2006.06941  [pdf]

    cs.CY cs.LG eess.SY

    Vulnerable Road User Detection Using Smartphone Sensors and Recurrence Quantification Analysis

    Authors: Huthaifa I. Ashqar, Mohammed Elhenawy, Mahmoud Masoud, Andry Rakotonirainy, Hesham A. Rakha

    Abstract: With the fast advancements of the Autonomous Vehicle (AV) industry, detection of Vulnerable Road Users (VRUs) using smartphones is critical for safety applications of Cooperative Intelligent Transportation Systems (C-ITSs). This study explores the use of low-power smartphone sensors and the Recurrence Quantification Analysis (RQA) features for this task. These features are computed over a threshol…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: Published in: 2019 IEEE Intelligent Transportation Systems Conference (ITSC)

    Journal ref: 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 2019, pp. 1054-1059

  15. arXiv:2006.04033  [pdf]

    cs.CY cs.CV cs.LG

    A Comparative Analysis of E-Scooter and E-Bike Usage Patterns: Findings from the City of Austin, TX

    Authors: Mohammed Hamad Almannaa, Huthaifa I. Ashqar, Mohammed Elhenawy, Mahmoud Masoud, Andry Rakotonirainy, Hesham Rakha

    Abstract: E-scooter-sharing and e-bike-sharing systems are accommodating and easing the increased traffic in dense cities and are expanding considerably. However, these new micro-mobility transportation modes raise numerous operational and safety concerns. This study analyzes e-scooter and dockless e-bike sharing system user behavior. We investigate how average trip speed change depending on the day of the…

    Submitted 6 June, 2020; originally announced June 2020.

    Comments: Submitted to the International Journal of Sustainable Transportation

  16. arXiv:2001.00009  [pdf, other]

    cs.CL

    Deep Reinforced Self-Attention Masks for Abstractive Summarization (DR.SAS)

    Authors: Ankit Chadha, Mohamed Masoud

    Abstract: We present a novel architectural scheme to tackle the abstractive summarization problem based on the CNN/DM dataset which fuses Reinforcement Learning (RL) with UniLM, which is a pre-trained Deep Learning Model, to solve various natural language tasks. We have tested the limits of learning fine-grained attention in Transformers to improve the summarization quality. UniLM applies attention to the ent…

    Submitted 29 December, 2019; originally announced January 2020.

  17. Topological Stability: a New Algorithm for Selecting The Nearest Neighbors in Non-Linear Dimensionality Reduction Techniques

    Authors: Mohammed Elhenawy, Mahmoud Masoud, Sebastian Glaser, Andry Rakotonirainy

    Abstract: In the machine learning field, dimensionality reduction is an important task. It mitigates the undesired properties of high-dimensional spaces to facilitate classification, compression, and visualization of high-dimensional data. During the last decade, researchers proposed many new (non-linear) techniques for dimensionality reduction. Most of these techniques are based on the intuition that data…

    Submitted 16 November, 2019; v1 submitted 13 November, 2019; originally announced November 2019.

  18. arXiv:1812.06544  [pdf, other]

    cs.CV cs.AI cs.LG

    Towards Robust Human Activity Recognition from RGB Video Stream with Limited Labeled Data

    Authors: Krishanu Sarker, Mohamed Masoud, Saeid Belkasim, Shihao Ji

    Abstract: Human activity recognition based on video streams has received numerous attentions in recent years. Due to lack of depth information, RGB video based activity recognition performs poorly compared to RGB-D video based solutions. On the other hand, acquiring depth information, inertia etc. is costly and requires special equipment, whereas RGB video streams are available in ordinary cameras. Hence, o…

    Submitted 16 December, 2018; originally announced December 2018.

    Comments: To appear in ICMLA 2018

  19. arXiv:1810.11134  [pdf, other]

    cs.DS

    Optimizing Capacitated Vehicle Scheduling with Time Windows: A Case Study of RMC Delivery

    Authors: Mohamed Masoud, Saeid Belkasim

    Abstract: Ready Mixed Concrete Delivery Problem (RMCDP) is a multi-objective multi-constraint dynamic combinatorial optimization problem. From the operational research perspective, it is a real life logistic problem that is hard to be solved with large instances. In RMCDP, there is a need to optimize the Ready Mixed Concrete (RMC) delivery by predetermining an optimal schedule for the sites-trips assignmen…

    Submitted 25 October, 2018; originally announced October 2018.