-
NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
Authors:
Ting-Hsuan Chen,
Jiewen Chan,
Hau-Shiang Shiu,
Shih-Han Yen,
Chang-Han Yeh,
Yu-Lun Liu
Abstract:
We propose a video editing framework, NaRCan, which integrates a hybrid deformation field and diffusion prior to generate high-quality natural canonical images to represent the input video. Our approach utilizes homography to model global motion and employs multi-layer perceptrons (MLPs) to capture local residual deformations, enhancing the model's ability to handle complex video dynamics. By introducing a diffusion prior from the early stages of training, our model ensures that the generated images retain a high-quality natural appearance, making the produced canonical images suitable for various downstream tasks in video editing, a capability not achieved by current canonical-based methods. Furthermore, we incorporate low-rank adaptation (LoRA) fine-tuning and introduce a noise and diffusion prior update scheduling technique that accelerates the training process by 14 times. Extensive experimental results show that our method outperforms existing approaches in various video editing tasks and produces coherent and high-quality edited video sequences. See our project page for video results at https://koi953215.github.io/NaRCan_page/.
Submitted 10 June, 2024;
originally announced June 2024.
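As a rough illustration of the hybrid deformation idea described in the abstract above, the sketch below maps a pixel coordinate through a global homography and then adds a small local residual offset. The names `apply_homography` and `warp_point` and the toy residual function are assumptions made for illustration; in the paper the residual is predicted by an MLP, and this is not the authors' implementation.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def apply_homography(h: Matrix3, p: Point) -> Point:
    """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
    x, y = p
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)


def warp_point(h: Matrix3, p: Point, residual: Callable[[Point], Point]) -> Point:
    """Global motion from the homography plus a local residual offset.

    `residual` stands in for a learned local deformation (hypothetical here);
    evaluating it at the input point is an assumption of this sketch.
    """
    gx, gy = apply_homography(h, p)
    dx, dy = residual(p)
    return (gx + dx, gy + dy)
```

For example, `warp_point(translation, (2.0, 3.0), lambda p: (0.5, 0.5))` with a pure-translation homography shifts the point by the translation and then by the residual.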
-
A nonlinear hidden layer enables actor-critic agents to learn multiple paired association navigation
Authors:
M Ganesh Kumar,
Cheston Tan,
Camilo Libedinsky,
Shih-Cheng Yen,
Andrew Yong-Yi Tan
Abstract:
Navigation to multiple cued reward locations has been increasingly used to study rodent learning. Though deep reinforcement learning agents have been shown to be able to learn the task, they are not biologically plausible. Biologically plausible classic actor-critic agents have been shown to learn to navigate to single reward locations, but which biologically plausible agents are able to learn multiple cue-reward location tasks has remained unclear. In this computational study, we show versions of classic agents that learn to navigate to a single reward location, and adapt to reward location displacement, but are not able to learn multiple paired association navigation. The limitation is overcome by an agent in which place cell and cue information are first processed by a feedforward nonlinear hidden layer with synapses to the actor and critic subject to temporal difference error-modulated plasticity. Faster learning is obtained when the feedforward layer is replaced by a recurrent reservoir network.
Submitted 15 July, 2021; v1 submitted 25 June, 2021;
originally announced June 2021.
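The abstract above refers to plasticity modulated by a temporal difference (TD) error. A minimal sketch of the textbook one-step TD error and a tabular critic update is given below; the function names, the discount factor and the learning rate are assumptions, and this is the standard TD(0) form rather than the specific synaptic rule used in the paper.

```python
def td_error(reward: float, value: float, next_value: float,
             gamma: float = 0.9, terminal: bool = False) -> float:
    """One-step TD error: delta = r + gamma * V(s') - V(s).

    At a terminal transition there is no successor value to bootstrap from.
    """
    target = reward if terminal else reward + gamma * next_value
    return target - value


def update_critic(value: float, delta: float, lr: float = 0.1) -> float:
    """Move the value estimate a small step in the direction of the TD error."""
    return value + lr * delta
```

In an actor-critic agent the same `delta` would typically also scale the actor's update, so actions followed by better-than-expected outcomes become more likely.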
-
One-shot learning of paired association navigation with biologically plausible schemas
Authors:
M Ganesh Kumar,
Cheston Tan,
Camilo Libedinsky,
Shih-Cheng Yen,
Andrew Yong-Yi Tan
Abstract:
Schemas are knowledge structures that can enable rapid learning. Rodent one-shot learning in a multiple paired association navigation task has been postulated to be schema-dependent. But how schemas, conceptualized at Marr's computational level, correspond with neural implementations remains poorly understood, and a biologically plausible computational model of the rodent learning has not been demonstrated. Here, we compose such an agent from schemas with biologically plausible neural implementations. The agent contains an associative memory that can form one-shot associations between sensory cues and goal coordinates, implemented with a feedforward layer or a reservoir of recurrently connected neurons whose plastic output weights are governed by a novel 4-factor reward-modulated Exploratory Hebbian (EH) rule. Adding an actor-critic allows the agent to succeed even if an obstacle prevents direct heading. With the addition of working memory, the rodent behavior is replicated. Temporal-difference learning of a working memory gating mechanism enables one-shot learning despite distractors.
Submitted 27 August, 2023; v1 submitted 7 June, 2021;
originally announced June 2021.
-
A RoboStack Tutorial: Using the Robot Operating System Alongside the Conda and Jupyter Data Science Ecosystems
Authors:
Tobias Fischer,
Wolf Vollprecht,
Silvio Traversaro,
Sean Yen,
Carlos Herrero,
Michael Milford
Abstract:
We argue that it is beneficial to tightly couple the widely-used Robot Operating System with Conda, a cross-platform, language-agnostic package manager, and Jupyter, a web-based interactive computational environment affording scientific computing. We provide new ROS packages for Conda, enabling the installation of ROS alongside data-science and machine-learning packages with ease. Multiple ROS versions (currently ROS1 Melodic and Noetic, as well as ROS2 Foxy and Galactic) can run simultaneously on one machine, with pre-compiled binaries available for Linux, Windows and OSX, and the ARM architecture (e.g. the Raspberry Pi and the new Apple Silicon). To deal with the large size of the ROS ecosystem, we significantly improved the speed of the Conda solver and build system by rewriting the crucial parts in C++. We further contribute a collection of JupyterLab extensions for ROS, including plugins for live plotting, debugging and robot control, as well as tight integration with Zethus, an RViz-like visualization tool. Taken together, RoboStack combines the best of the data-science and robotics worlds to help researchers and developers build custom solutions for their academic and industrial projects.
Submitted 15 December, 2021; v1 submitted 26 April, 2021;
originally announced April 2021.
-
Polygames: Improved Zero Learning
Authors:
Tristan Cazenave,
Yen-Chi Chen,
Guan-Wei Chen,
Shi-Yu Chen,
Xian-Dong Chiu,
Julien Dehos,
Maria Elsa,
Qucheng Gong,
Hengyuan Hu,
Vasil Khalidov,
Cheng-Ling Li,
Hsin-I Lin,
Yu-** Lin,
Xavier Martinet,
Vegard Mella,
Jeremy Rapin,
Baptiste Roziere,
Gabriel Synnaeve,
Fabien Teytaud,
Olivier Teytaud,
Shi-Cheng Ye,
Yi-Jun Ye,
Shi-Jim Yen,
Sergey Zagoruyko
Abstract:
Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art method for many board games. It can be improved using a fully convolutional structure (no fully connected layer). Using such an architecture plus global pooling, we can create bots independent of the board size. The training can be made more robust by keeping track of the best checkpoints during the training and by training against them. Using these features, we release Polygames, our framework for Zero learning, with its library of games and its checkpoints. We won against strong humans at the game of Hex in 19x19, which was often said to be intractable for zero learning; and in Havannah. We also won several first places at the TAAI competitions.
Submitted 27 January, 2020;
originally announced January 2020.
-
Integration of Static and Dynamic Analysis for Malware Family Classification with Composite Neural Network
Authors:
Yao Saint Yen,
Zhe Wei Chen,
Ying Ren Guo,
Meng Chang Chen
Abstract:
Deep learning has been used in malware analysis research. Most classification methods use either static analysis features or dynamic analysis features for malware family classification; they rarely combine the two, and little effort has been spent on integrating them. In this paper, we combine static and dynamic analysis features with deep neural networks for Windows malware classification. We develop several methods to generate static and dynamic analysis features to classify malware in different ways. Given these features, we conduct experiments with a composite neural network, showing that the proposed approach performs best with an accuracy of 83.17% on a total of 80 malware families with 4519 malware samples. Additionally, we show that using integrated features for malware family classification outperforms using static features or dynamic features alone. We show how static and dynamic features complement each other for malware classification.
Submitted 24 December, 2019;
originally announced December 2019.
-
Self-Expressive Subspace Clustering to Recognize Motion Dynamics of a Multi-Joint Coordination for Chronic Ankle Instability
Authors:
Shaodi Qian,
Sheng-Che Yen,
Eric Folmar,
Chun-An Chou
Abstract:
Ankle sprains and instability are major public health concerns. Up to 70% of individuals do not fully recover from a single ankle sprain and eventually develop chronic ankle instability (CAI). The diagnosis of CAI has been mainly based on self-report rather than objective biomechanical measures. The goal of this study is to quantitatively recognize the motion pattern of a multi-joint coordination using biosensor data from bilateral hip, knee, and ankle joints, and further distinguish between CAI and healthy cohorts. We propose an analytic framework, where a nonlinear subspace clustering method is developed to learn the motion dynamic patterns from an inter-connected network of multiple joints. A support vector machine model is trained with a leave-one-subject-out cross validation to validate the learned measures compared to traditional statistical measures. The computational results showed >70% classification accuracy on average based on the dataset of 48 subjects (25 with CAI and 23 normal controls) examined in our designed experiment. It is found that CAI can be observed from other joints (e.g., hips) significantly, which reflects the fact that there are interactions in the multi-joint coordination system. The developed method presents a potential to support the decisions with motion patterns during diagnosis, treatment, and rehabilitation of gait abnormality caused by physical injury (e.g., ankle sprains in this study) or even central nervous system disorders.
Submitted 25 September, 2019; v1 submitted 6 January, 2019;
originally announced January 2019.
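The leave-one-subject-out cross validation mentioned in the abstract above can be sketched as follows: every fold holds out all samples from one subject and trains on the rest, so no subject appears on both sides of a split. The helper name `leave_one_subject_out` and the toy subject labels are assumptions for illustration; the classifier itself (an SVM in the paper) is omitted.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_subject_out(
    subject_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    `subject_ids[i]` names the subject that sample i was recorded from.
    """
    # Preserve first-seen order so the folds are deterministic.
    subjects: List[str] = []
    for s in subject_ids:
        if s not in subjects:
            subjects.append(s)

    for held_out in subjects:
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

With 48 subjects this produces 48 folds, and the reported accuracy would be an average over them.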
-
A Network-based Multimodal Data Fusion Approach for Characterizing Dynamic Multimodal Physiological Patterns
Authors:
Miaolin Fan,
Chun-An Chou,
Sheng-Che Yen,
Yingzi Lin
Abstract:
Characterizing the dynamic interactive patterns of complex systems helps gain in-depth understanding of how components interrelate with each other while performing certain functions as a whole. In this study, we present a novel multimodal data fusion approach to construct a complex network, which models the interactions of biological subsystems in the human body under emotional states through physiological responses. Joint recurrence plot and temporal network metrics are employed to integrate the multimodal information at the signal level. A public benchmark dataset is used for evaluating our model.
Submitted 3 January, 2019;
originally announced January 2019.
-
Egocentric Spatial Memory
Authors:
Mengmi Zhang,
Keng Teck Ma,
Shih-Cheng Yen,
Joo Hwee Lim,
Qi Zhao,
Jiashi Feng
Abstract:
Egocentric spatial memory (ESM) defines a memory system with encoding, storing, recognizing and recalling the spatial information about the environment from an egocentric perspective. We introduce an integrated deep neural network architecture for modeling ESM. It learns to estimate the occupancy state of the world and progressively construct top-down 2D global maps from egocentric views in a spatially extended environment. During the exploration, our proposed ESM model updates its belief of the global map based on local observations using a recurrent neural network. It also augments the local mapping with a novel external memory to encode and store latent representations of the visited places over long-term exploration in large environments, which enables agents to perform place recognition and hence, loop closure. Our proposed ESM network contributes in the following aspects: (1) without feature engineering, our model predicts free space based on egocentric views efficiently in an end-to-end manner; (2) unlike other deep learning-based mapping systems, ESMN deals with continuous actions and states, which is vitally important for robotic control in real applications. In the experiments, we demonstrate its accurate and robust global mapping capacities in 3D virtual mazes and realistic indoor environments by comparing with several competitive baselines.
Submitted 31 July, 2018;
originally announced July 2018.
-
Human vs. Computer Go: Review and Prospect
Authors:
Chang-Shing Lee,
Mei-Hui Wang,
Shi-Jim Yen,
Ting-Han Wei,
I-Chen Wu,
**-Chiang Chou,
Chun-Hsun Chou,
Ming-Wan Wang,
Tai-Hsiung Yang
Abstract:
The Google DeepMind challenge match in March 2016 was a historic achievement for computer Go development. This article discusses the development of computational intelligence (CI) and its relative strength in comparison with human intelligence for the game of Go. We first summarize the milestones achieved for computer Go from 1998 to 2016. Then, the computer Go programs that have participated in previous IEEE CIS competitions as well as methods and techniques used in AlphaGo are briefly introduced. Commentaries from three high-level professional Go players on the five AlphaGo versus Lee Sedol games are also included. We conclude that AlphaGo beating Lee Sedol is a huge achievement in artificial intelligence (AI) based largely on CI methods. In the future, powerful computer Go programs such as AlphaGo are expected to be instrumental in promoting Go education and AI real-world applications.
Submitted 7 June, 2016;
originally announced June 2016.
-
Depth, balancing, and limits of the Elo model
Authors:
Marie-Liesse Cauwet,
Olivier Teytaud,
Hua-Min Liang,
Shi-Jim Yen,
Hung-Hsuan Lin,
I-Chen Wu,
Tristan Cazenave,
Abdallah Saffidine
Abstract:
Much work has been devoted to the computational complexity of games. However, they are not necessarily relevant for estimating the complexity in human terms. Therefore, human-centered measures have been proposed, e.g. the depth. This paper discusses the depth of various games and extends it to a continuous measure. We provide new depth results and present tools (given-first-move, pie rule, size extension) for increasing it. We also use these measures for analyzing games and opening moves in Y, NoGo, Killall Go, and the effect of pie rules.
Submitted 6 November, 2015;
originally announced November 2015.
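For readers unfamiliar with the Elo model named in the title above, the sketch below shows the standard Elo expected-score formula and rating update. The function names and the K-factor of 20 are assumptions for illustration; the depth measure discussed in the paper is built on top of such ratings and is not reproduced here.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model.

    E_A = 1 / (1 + 10 ** ((R_B - R_A) / 400))
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, expected: float, actual: float,
                  k: float = 20.0) -> float:
    """Shift a rating toward the observed result (1 win, 0.5 draw, 0 loss)."""
    return rating + k * (actual - expected)
```

Under this model a 400-point rating gap corresponds to expected scores of roughly 10/11 and 1/11, and the two players' expected scores always sum to 1.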
-
Formalization of the data flow diagram rules for consistency check
Authors:
Rosziati Ibrahim,
Siow Yen Yen
Abstract:
In the system development life cycle (SDLC), a system model can be developed using a Data Flow Diagram (DFD). A DFD is a graphical diagram for specifying, constructing and visualizing the model of a system. DFDs are used to define requirements in a graphical view. In this paper, we focus on DFDs and their rules for drawing and defining the diagrams. We then formalize these rules and develop a tool based on the formalized rules. The formalized rules for consistency check between the diagrams are used in developing the tool. This ensures that the syntax for drawing the diagrams is correct and strictly followed. The tool automates the process of manual consistency check between data flow diagrams.
Submitted 1 November, 2010;
originally announced November 2010.
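To make the idea of rule-based DFD checking concrete, the sketch below validates a diagram against two commonly taught DFD drawing rules: every data flow must have a process at one end or the other, and every process needs at least one input and one output flow. The data model, the function name `check_dfd` and the choice of rules are assumptions for illustration; they are not the formalization or the tool described in the paper.

```python
from typing import Dict, List, Tuple

# Node kinds used in this sketch: "process", "store", "entity".
Flow = Tuple[str, str]  # (source node, destination node)


def check_dfd(nodes: Dict[str, str], flows: List[Flow]) -> List[str]:
    """Return a list of rule violations; an empty list means no violation found."""
    errors: List[str] = []

    for src, dst in flows:
        if src not in nodes or dst not in nodes:
            errors.append(f"flow {src}->{dst} references an undefined node")
            continue
        # Data may not move directly between stores and/or external entities.
        if nodes[src] != "process" and nodes[dst] != "process":
            errors.append(f"flow {src}->{dst} does not involve a process")

    for name, kind in nodes.items():
        if kind != "process":
            continue
        if not any(dst == name for _, dst in flows):
            errors.append(f"process {name} has no input flow")
        if not any(src == name for src, _ in flows):
            errors.append(f"process {name} has no output flow")

    return errors
```

A consistency check between diagram levels would add further rules, for example comparing the flows entering and leaving a parent process with those of its child diagram.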