-
Optimizing Nepali PDF Extraction: A Comparative Study of Parser and OCR Technologies
Authors:
Prabin Paudel,
Supriya Khadka,
Ranju G. C.,
Rahul Shah
Abstract:
This research compares PDF parsing and Optical Character Recognition (OCR) methods for extracting Nepali content from PDFs. PDF parsing offers fast and accurate extraction but faces challenges with non-Unicode Nepali fonts. OCR, specifically PyTesseract, overcomes these challenges, providing versatility for both digital and scanned PDFs. The study reveals that while PDF parsers are faster, their accuracy fluctuates based on PDF types. In contrast, OCRs, with a focus on PyTesseract, demonstrate consistent accuracy at the expense of slightly longer extraction times. Considering the project's emphasis on Nepali PDFs, PyTesseract emerges as the most suitable library, balancing extraction speed and accuracy.
Submitted 5 July, 2024;
originally announced July 2024.
-
Scaling Data-Driven Building Energy Modelling using Large Language Models
Authors:
Sunil Khadka,
Liang Zhang
Abstract:
Building Management System (BMS) through a data-driven method always faces data and model scalability issues. We propose a methodology to tackle the scalability challenges associated with the development of data-driven models for BMS by using Large Language Models (LLMs). LLMs' code generation adaptability can enable broader adoption of BMS by "automating the automation," particularly the data handling and data-driven modeling processes. In this paper, we use LLMs to generate code that processes structured data from BMS and build data-driven models for BMS's specific requirements. This eliminates the need for manual data and model development, reducing the time, effort, and cost associated with this process. Our hypothesis is that LLMs can incorporate domain knowledge about data science and BMS into data processing and modeling, ensuring that the data-driven modeling is automated for specific requirements of different building types and control objectives, which also improves accuracy and scalability. We generate a prompt template following the framework of Machine Learning Operations so that the prompts are designed to systematically generate Python code for data-driven modeling. Our case study indicates that bi-sequential prompting under the prompt template can achieve a high success rate of code generation and code accuracy, and significantly reduce human labor costs.
Submitted 3 July, 2024;
originally announced July 2024.
-
Learning Intrinsic Symbolic Rewards in Reinforcement Learning
Authors:
Hassam Sheikh,
Shauharda Khadka,
Santiago Miret,
Somdeb Majumdar
Abstract:
Learning effective policies for sparse objectives is a key challenge in Deep Reinforcement Learning (RL). A common approach is to design task-related dense rewards to improve task learnability. While such rewards are easily interpreted, they rely on heuristics and domain expertise. Alternate approaches that train neural networks to discover dense surrogate rewards avoid heuristics, but are high-dimensional, black-box solutions offering little interpretability. In this paper, we present a method that discovers dense rewards in the form of low-dimensional symbolic trees - thus making them more tractable for analysis. The trees use simple functional operators to map an agent's observations to a scalar reward, which then supervises the policy gradient learning of a neural network policy. We test our method on continuous action spaces in Mujoco and discrete action spaces in Atari and Pygame environments. We show that the discovered dense rewards are an effective signal for an RL policy to solve the benchmark tasks. Notably, we significantly outperform a widely used, contemporary neural-network based reward-discovery algorithm in all environments considered.
Submitted 9 October, 2020; v1 submitted 7 October, 2020;
originally announced October 2020.
-
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
Authors:
Shauharda Khadka,
Estelle Aflalo,
Mattias Marder,
Avrech Ben-David,
Santiago Miret,
Shie Mannor,
Tamir Hazan,
Hanlin Tang,
Somdeb Majumdar
Abstract:
For deep neural network accelerators, memory movement is both energetically expensive and can bound computation. Therefore, optimal mapping of tensors to memory hierarchies is critical to performance. The growing complexity of neural networks calls for automated memory mapping instead of manual heuristic approaches; yet the search space of neural network computational graphs has previously been prohibitively large. We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces, that combines graph neural networks, reinforcement learning, and evolutionary search. A set of fast, stateless policies guides the evolutionary search to improve its sample-efficiency. We train and validate our approach directly on the Intel NNP-I chip for inference. EGRL outperforms policy-gradient, evolutionary search and dynamic programming baselines on BERT, ResNet-101 and ResNet-50. We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
Submitted 15 October, 2020; v1 submitted 14 July, 2020;
originally announced July 2020.
-
Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination
Authors:
Shauharda Khadka,
Somdeb Majumdar,
Santiago Miret,
Stephen McAleer,
Kagan Tumer
Abstract:
Many cooperative multiagent reinforcement learning environments provide agents with a sparse team-based reward, as well as a dense agent-specific reward that incentivizes learning basic skills. Training policies solely on the team-based reward is often difficult due to its sparsity. Furthermore, relying solely on the agent-specific reward is sub-optimal because it usually does not capture the team coordination objective. A common approach is to use reward shaping to construct a proxy reward by combining the individual rewards. However, this requires manual tuning for each environment. We introduce Multiagent Evolutionary Reinforcement Learning (MERL), a split-level training platform that handles the two objectives separately through two optimization processes. An evolutionary algorithm maximizes the sparse team-based objective through neuroevolution on a population of teams. Concurrently, a gradient-based optimizer trains policies to only maximize the dense agent-specific rewards. The gradient-based policies are periodically added to the evolutionary population as a way of information transfer between the two optimization processes. This enables the evolutionary algorithm to use skills learned via the agent-specific rewards toward optimizing the global objective. Results demonstrate that MERL significantly outperforms state-of-the-art methods, such as MADDPG, on a number of difficult coordination benchmarks.
Submitted 11 June, 2020; v1 submitted 17 June, 2019;
originally announced June 2019.
-
Collaborative Evolutionary Reinforcement Learning
Authors:
Shauharda Khadka,
Somdeb Majumdar,
Tarek Nassar,
Zach Dwiel,
Evren Tumer,
Santiago Miret,
Yinyin Liu,
Kagan Tumer
Abstract:
Deep reinforcement learning algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically struggle with achieving effective exploration and are extremely sensitive to the choice of hyperparameters. One reason is that most approaches use a noisy version of their operating policy to explore - thereby limiting the range of exploration. In this paper, we introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space. A collection of learners - typically proven algorithms like TD3 - optimize over varying time-horizons leading to this diverse portfolio. All learners contribute to and use a shared replay buffer to achieve greater sample efficiency. Computational resources are dynamically distributed to favor the best learners as a form of online algorithm selection. Neuroevolution binds this entire process to generate a single emergent learner that exceeds the capabilities of any individual learner. Experiments in a range of continuous control benchmarks demonstrate that the emergent learner significantly outperforms its composite learners while remaining overall more sample-efficient - notably solving the Mujoco Humanoid benchmark where all of its composite learners (TD3) fail entirely in isolation.
Submitted 6 May, 2019; v1 submitted 2 May, 2019;
originally announced May 2019.
-
Artificial Intelligence for Prosthetics - challenge solutions
Authors:
Łukasz Kidziński,
Carmichael Ong,
Sharada Prasanna Mohanty,
Jennifer Hicks,
Sean F. Carroll,
Bo Zhou,
Hongsheng Zeng,
Fan Wang,
Rongzhong Lian,
Hao Tian,
Wojciech Jaśkowski,
Garrett Andersen,
Odd Rune Lykkebø,
Nihat Engin Toklu,
Pranav Shyam,
Rupesh Kumar Srivastava,
Sergey Kolesnikov,
Oleksii Hrinchuk,
Anton Pechenko,
Mattias Ljungström,
Zhen Wang,
Xu Hu,
Zehong Hu,
Minghui Qiu,
Jun Huang
, et al. (25 additional authors not shown)
Abstract:
In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector. Top participants were invited to describe their algorithms. In this work, we describe the challenge and present thirteen solutions that used deep reinforcement learning approaches. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each team implemented different modifications of the known algorithms by, for example, dividing the task into subtasks, learning low-level control, or by incorporating expert knowledge and using imitation learning.
Submitted 6 February, 2019;
originally announced February 2019.
-
Evolution-Guided Policy Gradient in Reinforcement Learning
Authors:
Shauharda Khadka,
Kagan Tumer
Abstract:
Deep Reinforcement Learning (DRL) algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically suffer from three core difficulties: temporal credit assignment with sparse rewards, lack of effective exploration, and brittle convergence properties that are extremely sensitive to hyperparameters. Collectively, these challenges severely limit the applicability of these approaches to real-world problems. Evolutionary Algorithms (EAs), a class of black box optimization techniques inspired by natural evolution, are well suited to address each of these three challenges. However, EAs typically suffer from high sample complexity and struggle to solve problems that require optimization of a large number of parameters. In this paper, we introduce Evolutionary Reinforcement Learning (ERL), a hybrid algorithm that leverages the population of an EA to provide diversified data to train an RL agent, and reinserts the RL agent into the EA population periodically to inject gradient information into the EA. ERL inherits EA's ability of temporal credit assignment with a fitness metric, effective exploration with a diverse set of policies, and stability of a population-based approach and complements it with off-policy DRL's ability to leverage gradients for higher sample efficiency and faster learning. Experiments in a range of challenging continuous control benchmarks demonstrate that ERL significantly outperforms prior DRL and EA methods.
Submitted 27 October, 2018; v1 submitted 21 May, 2018;
originally announced May 2018.
-
FASHION: Fault-Aware Self-Healing Intelligent On-chip Network
Authors:
Pengju Ren,
Michel A. Kinsy,
Mengjiao Zhu,
Shreeya Khadka,
Mihailo Isakov,
Aniruddh Ramrakhyani,
Tushar Krishna,
Nanning Zheng
Abstract:
To avoid packet loss and deadlock scenarios that arise due to faults or power gating in multicore and many-core systems, the network-on-chip needs to possess resilient communication and load-balancing properties. In this work, we introduce the Fashion router, a self-monitoring and self-reconfiguring design that allows the on-chip network to dynamically adapt to component failures. First, we introduce a distributed intelligence unit, called the Self-Awareness Module (SAM), which allows the router to detect permanent component failures and build a network connectivity map. Using local information, SAM adapts to faults, guarantees connectivity and deadlock-free routing inside the maximal connected subgraph, and keeps routing tables up-to-date. Next, to reconfigure network links or virtual channels around faulty/power-gated components, we add bidirectional link and unified virtual channel structure features to the Fashion router. This version of the router, named Ex-Fashion, further mitigates the negative system performance impacts, leads to a larger maximal connected subgraph, and sustains a relatively high degree of fault-tolerance. To support the router, we develop a fault diagnosis and recovery algorithm executed by the Built-In Self-Test, self-monitoring, and self-reconfiguration units at runtime to provide fault-tolerant system functionalities. The Fashion router places no restriction on topology, position or number of faults. It drops 54.3-55.4% fewer nodes for the same number of faults (between 30 and 60 faults) in an 8x8 2D-mesh than other state-of-the-art solutions. It is scalable and efficient. The area overheads are 2.311% and 2.659% when implemented in 8x8 and 16x16 2D-meshes using the TSMC 65nm library at 1.38GHz clock frequency.
Submitted 8 February, 2017;
originally announced February 2017.
-
Synchronization dynamics on the picosecond timescale in coupled Josephson junction neurons
Authors:
Ken Segall,
Matthew LeGro,
Steven Kaplan,
Oleksiy Svitelskiy,
Shreeya Khadka,
Patrick Crotty,
Daniel Schult
Abstract:
Conventional digital computation is rapidly approaching physical limits for speed and energy dissipation. Here we fabricate and test a simple neuromorphic circuit that models neuronal somas, axons and synapses with superconducting Josephson junctions. The circuit models two mutually coupled excitatory neurons. In some regions of parameter space the neurons are desynchronized. In others, the Josephson neurons synchronize in one of two states, in-phase or anti-phase. An experimental alteration of the delay and strength of the connecting synapses can toggle the system back and forth in a phase-flip bifurcation. Firing synchronization states are calculated >70,000 times faster than conventional digital approaches. With their speed and low energy dissipation (10^-17 Joules/spike), this set of proof-of-concept experiments establishes Josephson junction neurons as a viable approach for improvements in neuronal computation as well as applications in neuromorphic computing.
Submitted 8 February, 2017; v1 submitted 16 August, 2016;
originally announced August 2016.
-
Performance Analysis of Hybrid Forecasting Model In Stock Market Forecasting
Authors:
Mahesh S. Khadka,
K. M. George,
N. Park,
J. B. Kim
Abstract:
This paper presents a performance analysis of a hybrid model, comprising concordance and Genetic Programming (GP), for forecasting financial markets, and compares it with some existing models. This scheme can be used for in-depth analysis of the stock market. Different measures of concordance, such as Kendall's Tau, Gini's Mean Difference, Spearman's Rho, and a weak interpretation of concordance, are used to search for patterns in the past that look similar to the present. Genetic Programming is then used to match the past trend to the present trend as closely as possible. The Genetic Program then estimates what will happen next based on what happened next in the past. The concept is validated using financial time series data (S&P 500 and NASDAQ indices) as sample data sets. The forecasted result is then compared with the standard ARIMA model and other models to analyse its performance.
Submitted 15 May, 2013; v1 submitted 20 September, 2012;
originally announced September 2012.