-
Visual Reasoning and Multi-Agent Approach in Multimodal Large Language Models (MLLMs): Solving TSP and mTSP Combinatorial Challenges
Authors:
Mohammed Elhenawy,
Ahmad Abutahoun,
Taqwa I. Alhadidi,
Ahmed Jaber,
Huthaifa I. Ashqar,
Shadi Jaradat,
Ahmed Abdelhay,
Sebastien Glaser,
Andry Rakotonirainy
Abstract:
Multimodal Large Language Models (MLLMs) harness comprehensive knowledge spanning text, images, and audio to adeptly tackle complex problems, including zero-shot in-context learning scenarios. This study explores the ability of MLLMs in visually solving the Traveling Salesman Problem (TSP) and Multiple Traveling Salesman Problem (mTSP) using images that portray point distributions on a two-dimensional plane. We introduce a novel approach employing multiple specialized agents within the MLLM framework, each dedicated to optimizing solutions for these combinatorial challenges. Our experimental investigation includes rigorous evaluations across zero-shot settings and introduces innovative multi-agent zero-shot in-context scenarios. The results demonstrated that both multi-agent models, Multi-Agent 1 (which includes the Initializer, Critic, and Scorer agents) and Multi-Agent 2 (which comprises only the Initializer and Critic agents), significantly improved solution quality for TSP and mTSP problems. Multi-Agent 1 excelled in environments requiring detailed route refinement and evaluation, providing a robust framework for sophisticated optimizations. In contrast, Multi-Agent 2, focusing on iterative refinements by the Initializer and Critic, proved effective for rapid decision-making scenarios. These experiments yield promising outcomes, showcasing the robust visual reasoning capabilities of MLLMs in addressing diverse combinatorial problems. The findings underscore the potential of MLLMs as powerful tools in computational optimization, offering insights that could inspire further advancements in this promising field. Project link: https://github.com/ahmed-abdulhuy/Solving-TSP-and-mTSP-Combinatorial-Challenges-using-Visual-Reasoning-and-Multi-Agent-Approach-MLLMs-.git
Submitted 26 June, 2024;
originally announced July 2024.
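As a rough illustration of how a candidate route for the entry above could be scored, the sketch below computes the length of a closed tour over 2-D points. It is not taken from the authors' project; the function name `tour_length` and the sample coordinates are hypothetical.

```python
import math

def tour_length(points, order):
    """Total Euclidean length of the closed tour that visits `points` in `order`."""
    total = 0.0
    for i in range(len(order)):
        x1, y1 = points[order[i]]
        # Wrap around so the last point connects back to the first.
        x2, y2 = points[order[(i + 1) % len(order)]]
        total += math.hypot(x2 - x1, y2 - y1)
    return total

# Hypothetical example: four corners of a 4 x 3 rectangle.
points = [(0, 0), (0, 3), (4, 3), (4, 0)]
perimeter = tour_length(points, [0, 1, 2, 3])  # follows the rectangle's edges
crossing = tour_length(points, [0, 2, 1, 3])   # uses both diagonals
```

The route that crosses itself is longer than the perimeter route, which is the kind of difference a scoring step would need to detect when comparing candidate tours.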
-
Object Detection using Oriented Window Learning Vision Transformer: Roadway Assets Recognition
Authors:
Taqwa Alhadidi,
Ahmed Jaber,
Shadi Jaradat,
Huthaifa I Ashqar,
Mohammed Elhenawy
Abstract:
Object detection is a critical component of transportation systems, particularly for applications such as autonomous driving, traffic monitoring, and infrastructure maintenance. Traditional object detection methods often struggle with limited data and variability in object appearance. The Oriented Window Learning Vision Transformer (OWL-ViT) offers a novel approach by adapting window orientations to the geometry and existence of objects, making it highly suitable for detecting diverse roadway assets. This study leverages OWL-ViT within a one-shot learning framework to recognize transportation infrastructure components, such as traffic signs, poles, pavement, and cracks. This study presents a novel method for roadway asset detection using OWL-ViT. We conducted a series of experiments to evaluate the performance of the model in terms of detection consistency, semantic flexibility, visual context adaptability, resolution robustness, and impact of non-max suppression. The results demonstrate the high efficiency and reliability of the OWL-ViT across various scenarios, underscoring its potential to enhance the safety and efficiency of intelligent transportation systems.
Submitted 15 June, 2024;
originally announced June 2024.
-
Eyeballing Combinatorial Problems: A Case Study of Using Multimodal Large Language Models to Solve Traveling Salesman Problems
Authors:
Mohammed Elhenawy,
Ahmed Abdelhay,
Taqwa I. Alhadidi,
Huthaifa I Ashqar,
Shadi Jaradat,
Ahmed Jaber,
Sebastien Glaser,
Andry Rakotonirainy
Abstract:
Multimodal Large Language Models (MLLMs) have demonstrated proficiency in processing diverse modalities, including text, images, and audio. These models leverage extensive pre-existing knowledge, enabling them to address complex problems with minimal to no specific training examples, as evidenced in few-shot and zero-shot in-context learning scenarios. This paper investigates the use of MLLMs' visual capabilities to 'eyeball' solutions for the Traveling Salesman Problem (TSP) by analyzing images of point distributions on a two-dimensional plane. Our experiments aimed to validate the hypothesis that MLLMs can effectively 'eyeball' viable TSP routes. The results from zero-shot, few-shot, self-ensemble, and self-refine zero-shot evaluations show promising outcomes. We anticipate that these findings will inspire further exploration into MLLMs' visual reasoning abilities to tackle other combinatorial problems.
Submitted 10 June, 2024;
originally announced June 2024.
-
Multiple-Valued Logic Circuit Design and Data Transmission Intended for Embedded Systems
Authors:
Ramzi A. Jaber,
Lina Nimri,
Ali M. Haidar
Abstract:
This thesis proposes novel ternary circuits that aim to reduce energy consumption and thereby preserve battery life. The proposed designs include eight ternary logic gates, three ternary combinational circuits, and six Ternary Arithmetic Logic Units. To achieve its objective, the thesis seeks the best tradeoff among reducing the number of transistors used, utilizing energy-efficient transistor arrangements such as transmission gates, and applying dual supply voltages. The proposed designs are compared to the latest ternary circuits using the HSPICE simulator for different supply voltages, different temperatures, and different frequencies. Simulations are performed to prove the efficiency of the proposed designs. The results demonstrate the advantage of the proposed designs, with a reduction of over 73 percent in transistor count for the THA and over 88 percent in energy consumption for the STI, TNAND, TDecoder, TMUX, THA, and TMUL. Moreover, the noise immunity curve and Monte Carlo analysis for major process variations (TOX, CNT diameter, CNT count, and channel length) were studied.
Submitted 8 November, 2022;
originally announced November 2022.
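For readers unfamiliar with the gate names in the entry above, the following sketch models three of them at the logic level. It assumes the usual unbalanced ternary convention (logic values 0, 1, 2) and the textbook definitions of the standard ternary inverter (STI), ternary NAND (TNAND), and ternary half adder (THA). It says nothing about the transistor-level designs in the thesis, and the function names are hypothetical.

```python
def sti(x):
    """Standard ternary inverter: maps 0 -> 2, 1 -> 1, 2 -> 0."""
    return 2 - x

def tnand(a, b):
    """Ternary NAND: the STI of the ternary AND, where AND is min(a, b)."""
    return 2 - min(a, b)

def tha(a, b):
    """Ternary half adder: returns (sum, carry) for two one-trit inputs."""
    total = a + b
    return total % 3, total // 3

# Truth table of the half adder over all nine input pairs.
tha_table = {(a, b): tha(a, b) for a in range(3) for b in range(3)}
```

Enumerating `tha_table` gives a logic-level reference that a circuit simulation could be checked against.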
-
Analyzing Community-aware Centrality Measures Using The Linear Threshold Model
Authors:
Stephany Rajeh,
Ali Yassin,
Ali Jaber,
Hocine Cherifi
Abstract:
Targeting influential nodes in complex networks makes it possible to accelerate or hinder the spread of rumors, epidemics, and electric blackouts. Since communities are prevalent in real-world networks, community-aware centrality measures exploit this information to target influential nodes. Research shows that they compare favorably with classical measures that are agnostic about the community structure. Although the diffusion process is of prime importance, previous studies consider mainly the well-known Susceptible-Infected-Recovered (SIR) epidemic propagation model. This work investigates the consistency of previous analyses using the popular Linear Threshold (LT) propagation model, which characterizes many real-life spreading processes. We perform a comparative analysis of seven influential community-aware centrality measures on thirteen real-world networks. Overall, results show that Community-based Mediator, Comm Centrality, and Modularity Vitality outperform the other measures. Moreover, Community-based Mediator is more effective on a tight budget (i.e., a small fraction of initially activated nodes), while Comm Centrality and Modularity Vitality perform better with a medium to high fraction of initially activated nodes.
Submitted 30 January, 2022;
originally announced February 2022.
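The Linear Threshold model named in the entry above can be sketched as follows: an inactive node becomes active once the summed weight of its active in-neighbors reaches its threshold. This is a minimal sketch, not the authors' experimental code. The function name, the toy graph, and the fixed thresholds are hypothetical (in the standard model, thresholds are usually drawn uniformly at random).

```python
def linear_threshold(in_neighbors, weights, thresholds, seeds):
    """Run Linear Threshold diffusion until no further node activates.

    in_neighbors: dict mapping each node to a list of its in-neighbors
    weights:      dict mapping (source, target) edges to influence weights
    thresholds:   dict mapping each node to its activation threshold
    seeds:        iterable of initially active nodes
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in in_neighbors.items():
            if node in active:
                continue
            # Sum the influence arriving from already-active in-neighbors.
            influence = sum(weights[(u, node)] for u in nbrs if u in active)
            if influence >= thresholds[node]:
                active.add(node)
                changed = True
    return active

# Hypothetical toy network: a -> b, a -> c, b -> c, c -> d.
in_neighbors = {"a": [], "b": ["a"], "c": ["a", "b"], "d": ["c"]}
weights = {("a", "b"): 0.6, ("a", "c"): 0.3, ("b", "c"): 0.4, ("c", "d"): 1.0}
thresholds = {"a": 1.0, "b": 0.5, "c": 0.6, "d": 0.9}

from_a = linear_threshold(in_neighbors, weights, thresholds, {"a"})
from_b = linear_threshold(in_neighbors, weights, thresholds, {"b"})
```

Seeding node `a` activates the whole toy network, while seeding `b` activates nothing else. That gap in spread is what a centrality measure is trying to predict when it ranks candidate seed nodes.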
-
Causal Identification under Markov Equivalence
Authors:
Amin Jaber,
Jiji Zhang,
Elias Bareinboim
Abstract:
Assessing the magnitude of cause-and-effect relations is one of the central challenges found throughout the empirical sciences. The problem of identification of causal effects is concerned with determining whether a causal effect can be computed from a combination of observational data and substantive knowledge about the domain under investigation, which is formally expressed in the form of a causal graph. In many practical settings, however, the knowledge available to the researcher is not strong enough to specify a unique causal graph. Another line of investigation attempts to use observational data to learn a qualitative description of the domain called a Markov equivalence class, which is the collection of causal graphs that share the same set of observed features. In this paper, we marry both approaches and study the problem of causal identification from an equivalence class, represented by a partial ancestral graph (PAG). We start by deriving a set of graphical properties of PAGs that are carried over to their induced subgraphs. We then develop an algorithm to compute the effect of an arbitrary set of variables on an arbitrary outcome set. We show that the algorithm is strictly more powerful than the current state of the art found in the literature.
Submitted 14 December, 2018;
originally announced December 2018.
-
Channel Coherence Classification with Frame-Shifting in Massive MIMO Systems
Authors:
Ahmad Abboud,
Oussama Habachi,
Ali Jaber,
Jean-Pierre Cances,
Vahid Meghdadi
Abstract:
This paper considers the uplink pilot overhead in time division duplexing (TDD) massive Multiple Input Multiple Output (MIMO) mobile systems. A common scenario in conventional massive MIMO systems is a Base Station (BS) serving all user terminals (UTs) in the cell with the same TDD frame format, which fits the coherence interval of the worst-case scenario of user mobility (e.g. a moving train with velocity 300 km/h). Furthermore, the BS has to estimate all the channels in each time-slot for all users, even for those with long coherence intervals. In fact, within the same cell, low-mobility UTs such as sensors or pedestrians (e.g. moving at 1.38 m/s) share the same short TDD frame and thus are obliged to upload their pilots in each time-slot. The channel coherence interval of the pedestrian UTs with a carrier frequency of 1.9 GHz can be as long as 60 times that of the train passenger users. In other words, conventional techniques waste 59 uploaded pilot sequences for channel estimation. In this paper, we address the resource waste caused by the differing coherence intervals across user mobility levels. We classify users based on their coherence interval length, and we propose to skip uploading pilots of UTs with large coherence intervals. Then, we shift frames with the same reused pilot sequence toward an empty pilot time-slot. Simulation results show that the proposed technique outperforms conventional massive MIMO systems in both energy and spectral efficiency.
Submitted 14 November, 2017; v1 submitted 28 October, 2017;
originally announced October 2017.
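The factor of 60 quoted in the entry above can be checked with a short calculation, assuming that the coherence interval is inversely proportional to the maximum Doppler shift f_d = v * f_c / c and reading the train speed as 300 km/h (an assumption about the intended unit). The function name below is hypothetical and the sketch is not drawn from the paper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def max_doppler_hz(speed_mps, carrier_hz):
    """Maximum Doppler shift for a terminal moving at speed_mps."""
    return speed_mps * carrier_hz / SPEED_OF_LIGHT

carrier_hz = 1.9e9            # carrier frequency stated in the abstract
train_mps = 300 / 3.6         # assumed 300 km/h converted to m/s
pedestrian_mps = 1.38         # pedestrian speed stated in the abstract

train_doppler = max_doppler_hz(train_mps, carrier_hz)
pedestrian_doppler = max_doppler_hz(pedestrian_mps, carrier_hz)

# Coherence interval scales as 1 / Doppler, so the ratio of intervals
# (pedestrian over train) equals the ratio of Doppler shifts (train over pedestrian).
coherence_ratio = train_doppler / pedestrian_doppler
```

Under these assumptions the ratio comes out slightly above 60, consistent with the abstract's statement that 59 of every 60 pedestrian pilot uploads are redundant.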
-
Morphology-based Entity and Relational Entity Extraction Framework for Arabic
Authors:
Amin Jaber,
Fadi A. Zaraket
Abstract:
Rule-based techniques to extract relational entities from documents allow users to specify desired entities with natural language questions, finite state automata, regular expressions and structured query language. They require linguistic and programming expertise and lack support for Arabic morphological analysis. We present a morphology-based entity and relational entity extraction framework for Arabic (MERF). MERF requires basic knowledge of linguistic features and regular expressions, and provides the ability to interactively specify Arabic morphological and synonymity features, tag types associated with regular expressions, and relations and code actions defined over matches of subexpressions. MERF constructs entities and relational entities from matches of the specifications. We evaluated MERF with several case studies. The results show that MERF requires shorter development time and effort compared to existing application specific techniques and produces reasonably accurate results within a reasonable overhead in run time.
Submitted 18 November, 2018; v1 submitted 17 September, 2017;
originally announced September 2017.
-
On the topology effects in wireless sensor networks based prognostics and health management
Authors:
Ahmad Farhat,
Abdallah Makhoul,
Christophe Guyeux,
Rami Tawil,
Ali Jaber,
Abbas Hijazi
Abstract:
In this work, we consider the use of wireless sensor networks (WSNs) to monitor an area of interest in order to diagnose its state in real time. Each sensor node forwards information about relevant features towards the sink, where the data is processed. Nevertheless, energy conservation is a key issue in the design of such networks: once a sensor exhausts its resources, it is dropped from the network, which leads to broken links and data loss. It is therefore important to keep the network running for as long as possible by preserving the energy held by the nodes. Indeed, preserving the quality of service (QoS) of a wireless sensor network over a long period is very important for ensuring accurate data, which in turn makes the diagnosis of the area more accurate. On the other hand, packet transmission is the phase that consumes the most energy compared to other activities in the network. The network topology therefore has an important impact on energy efficiency, and thus on data and diagnosis accuracy. In this paper, we study and compare four network topologies (distributed, hierarchical, centralized, and decentralized) and show their impact on the resulting diagnostic estimates. We use six diagnostic algorithms to evaluate both prognostics and health management as the type of WSN topology varies.
Submitted 20 August, 2017;
originally announced August 2017.
-
On the usefulness of information hiding techniques for wireless sensor networks security
Authors:
Rola Al-Sharif,
Christophe Guyeux,
Yousra Ahmed Fadil,
Abdallah Makhoul,
Ali Jaber
Abstract:
A wireless sensor network (WSN) typically consists of base stations and a large number of wireless sensors. The sensory data gathered from the whole network at a certain time snapshot can be visualized as an image. As a result, information hiding techniques can be applied to this "sensory data image". Steganography refers to the technology of hiding data in digital media without drawing any suspicion, while steganalysis is the art of detecting the presence of steganography. This article provides a brief review of steganography and steganalysis applications for wireless sensor networks (WSNs). We then show that steganographic techniques are relevant both to sensed-data authentication in wireless sensor networks and from the attacker's point of view, the latter of which has not yet been investigated in the literature. Our simulation results show that the sink level is unable to detect an attack carried out by the nsF5 algorithm on sensed data.
Submitted 25 June, 2017;
originally announced June 2017.
-
Smart Massive MIMO: An Infrastructure toward 5th Generation Smart Cities Network
Authors:
Ahmad Abboud,
Jean-Pierre Cances,
Vahid Meghdadi,
Ali Jaber
Abstract:
With the aim of optimizing wireless networks and improving the future 5th Generation mobile network infrastructure, we propose a novel infrastructure that can be the next Smart City Network. Our proposed infrastructure takes into consideration most future demands and challenges, including Capacity, Reliability, Scalability, and Flexibility. To deal with these issues, we propose a wireless network infrastructure that is based on the latest technologies of Massive MIMO systems. We further extend our infrastructure with many smart features so that it is capable of coping with Cloud Computing, Smartphones, IoT and other intelligence-based services. The proposed infrastructure uses Network Functions Virtualization (NFV), Software-Defined Networking (SDN), Virtual Antenna Arrays (VAA) and Joint Beamforming to afford flexibility. We further propose a Terminal-centric rather than a Cell-centric infrastructure, which optimizes the interference-aware environment and leads to higher capacity and reliability. The new infrastructure includes multi-purpose nodes that run a Network Operating System (NOS). These nodes provide scalable, flexible, cost-effective, and semi-distributed network resources. Other propositions that meet Power-Effective, Cost-Effective, and Scenery-aware design are discussed. Keywords - Wireless Network Infrastructure, Massive MIMO, Joint Beamforming, Cloud-based Networks, NFV, SDN, Cloud Computing, Grid Computing, and Distributed Systems.
Submitted 7 June, 2016;
originally announced June 2016.
-
Indoor Massive MIMO: Uplink Pilot Mitigation Using Channel State Information Map
Authors:
Ahmad Abboud,
Jean-Pierre Cances,
Ali H. Jaber,
Vahid Meghdadi
Abstract:
Massive MIMO brings both motivations and challenges to the development of 5th generation mobile wireless technology. The promising number of users and the high bitrate offered per unit area are challenged by uplink pilot contamination due to pilot reuse and a limited number of orthogonal pilot sequences. This paper proposes a solution to mitigate uplink pilot contamination in an indoor scenario where multiple cells share the same pool of pilot sequences, which is assumed to be smaller than the number of users. This can be done by reducing uplink pilots using Channel State Information (CSI) prediction. The proposed method is based on a machine learning approach, where a quantized version of Channel State Information (QCSI) is learned during the estimation session and stored at the Base Station (BS) to be exploited for future CSI prediction. The learned QCSI are represented by a weighted directed graph, which is used to monitor and predict the CSI of User Terminals (UTs) in the local cell. We introduce an online learning algorithm to create and update this graph, which we call the CSI map. Simulation results show an increase in the downlink sum-rate and a significant feedback reduction.
Submitted 30 April, 2016;
originally announced May 2016.