-
U-ARE-ME: Uncertainty-Aware Rotation Estimation in Manhattan Environments
Authors:
Aalok Patwardhan,
Callum Rhodes,
Gwangbin Bae,
Andrew J. Davison
Abstract:
Camera rotation estimation from a single image is a challenging task, often requiring depth data and/or camera intrinsics, which are generally not available for in-the-wild videos. Although external sensors such as inertial measurement units (IMUs) can help, they often suffer from drift and are not applicable in non-inertial reference frames. We present U-ARE-ME, an algorithm that estimates camera rotation along with uncertainty from uncalibrated RGB images. Using a Manhattan World assumption, our method leverages the per-pixel geometric priors encoded in single-image surface normal predictions and performs optimisation over the SO(3) manifold. Given a sequence of images, we can use the per-frame rotation estimates and their uncertainty to perform multi-frame optimisation, achieving robustness and temporal consistency. Our experiments demonstrate that U-ARE-ME performs comparably to RGB-D methods and is more robust than sparse feature-based SLAM methods. We encourage the reader to view the accompanying video at https://callum-rhodes.github.io/U-ARE-ME for a visual overview of our method.
Submitted 22 March, 2024;
originally announced March 2024.
-
A Distributed Multi-Robot Framework for Exploration, Information Acquisition and Consensus
Authors:
Aalok Patwardhan,
Andrew J. Davison
Abstract:
The distributed coordination of robot teams performing complex tasks is challenging to formulate. The different aspects of a complete task such as local planning for obstacle avoidance, global goal coordination and collaborative mapping are often solved separately, when clearly each of these should influence the others for the most efficient behaviour. In this paper we use the example application of distributed information acquisition as a robot team explores a large space to show that we can formulate the whole problem as a single factor graph with multiple connected layers representing each aspect. We use Gaussian Belief Propagation (GBP) as the inference mechanism, which permits parallel, on-demand or asynchronous computation for efficiency when different aspects are more or less important. This is the first time that a distributed GBP multi-robot solver has been proven to enable intelligent collaborative behaviour rather than just guiding robots to individual, selfish goals. We encourage the reader to view our demos at https://aalpatya.github.io/gbpstack
Submitted 3 October, 2023;
originally announced October 2023.
-
Distributing Collaborative Multi-Robot Planning with Gaussian Belief Propagation
Authors:
Aalok Patwardhan,
Riku Murai,
Andrew J. Davison
Abstract:
Precise coordinated planning over a forward time window enables safe and highly efficient motion when many robots must work together in tight spaces, but this would normally require centralised control of all devices which is difficult to scale. We demonstrate GBP Planning, a new purely distributed technique based on Gaussian Belief Propagation for multi-robot planning problems, formulated by a generic factor graph defining dynamics and collision constraints over a forward time window. In simulations, we show that our method allows high performance collaborative planning where robots are able to cross each other in busy, intricate scenarios. They maintain shorter, quicker and smoother trajectories than alternative distributed planning techniques even in cases of communication failure. We encourage the reader to view the accompanying video demonstration at https://youtu.be/8VSrEUjH610.
Submitted 26 January, 2023; v1 submitted 22 March, 2022;
originally announced March 2022.
-
Multimodal Affect Analysis for Product Feedback Assessment
Authors:
Amol S Patwardhan,
Gerald M Knapp
Abstract:
Consumers often react expressively to products such as food samples, perfume, jewelry, sunglasses, and clothing accessories. This research discusses a multimodal affect recognition system developed to classify whether a consumer likes or dislikes a product tested at a counter or kiosk, by analyzing the consumer's facial expression, body posture, hand gestures, and voice after testing the product. A depth-capable camera and microphone system - Kinect for Windows - is utilized. An emotion identification engine has been developed to analyze the images and voice to determine affective state of the customer. The image is segmented using skin color and adaptive threshold. Face, body and hands are detected using the Haar cascade classifier. Canny edges are identified and the lip, body and hand contours are extracted using spatial filtering. Edge count and orientation around the mouth, cheeks, eyes, shoulders, fingers and the location of the edges are used as features. Classification is done by an emotion template mapping algorithm and training a classifier using support vector machines. The real-time performance, accuracy and feasibility for multimodal affect recognition in feedback assessment are evaluated.
Submitted 7 May, 2017;
originally announced May 2017.
-
Structured Unit Testable Templated Code for Efficient Code Review Process
Authors:
Amol Patwardhan
Abstract:
Modern software development teams are distributed across onsite and off-shore locations. Each team has developers with varying experience levels and English communication skills. In such a diverse development environment it is important to maintain the software quality, coding standards, timely delivery of features and bug fixes. It is also important to reduce testing effort, minimize side effects such as change in functionality, user experience or application performance. Code reviews are intended to control code quality. Unfortunately, many projects lack enforcement of processes and standards because of approaching deadlines, live production issues and lack of resource availability. This study examines a novel structured, unit testable templated code method to enforce code review standards with an intent to reduce coding effort, minimize revisions and eliminate functional and performance side effects on the system. The proposed method would also result in unit-testable code that can also be easily rolled back and increase team productivity. The baseline for traditional code review processes using metrics such as code review duration, bug regression rate, revision count was measured. These metrics were then compared with results from the proposed code review process that used structured unit testable templated code. The performance on 2 large enterprise level applications spanning over 2 years and 9 feature and maintenance release cycles was evaluated. The structured unit testable templated code method resulted in a decrease in total code review time, revision count and coding effort. It also decreased the number of live production issues caused by code churn or side effects of bug fix when compared to traditional code review process.
Submitted 9 August, 2016;
originally announced October 2016.
-
Edge Based Grid Super-Imposition for Crowd Emotion Recognition
Authors:
Amol Patwardhan
Abstract:
Numerous automatic continuous emotion detection system studies have examined mostly use of videos and images containing individual person expressing emotions. This study examines the detection of spontaneous emotions in a group and crowd settings. Edge detection was used with a grid of lines superimposition to extract the features. The feature movement in terms of movement from the reference point was used to track across sequences of images from the color channel. Additionally the video data capturing was done on spontaneous emotions invoked by watching sports events from group of participants. The method was view and occlusion independent and the results were not affected by presence of multiple people chaotically expressing various emotions. The edge thresholds of 0.2 and grid thresholds of 20 showed the best accuracy results. The overall accuracy of the group emotion classifier was 70.9%.
Submitted 7 August, 2016;
originally announced October 2016.
-
Automated Prediction of Temporal Relations
Authors:
Amol S Patwardhan,
Jacob Badeaux,
Siavash,
Gerald M Knapp
Abstract:
Background: There has been growing research interest in automated answering of questions or generation of summary of free form text such as news article. In order to implement this task, the computer should be able to identify the sequence of events, duration of events, time at which event occurred and the relationship type between event pairs, time pairs or event-time pairs. Specific Problem: It is important to accurately identify the relationship type between combinations of event and time before the temporal ordering of events can be defined. The machine learning approach taken in Mani et al. (2006) provides an accuracy of only 62.5 on the baseline data from TimeBank. The researchers used maximum entropy classifier in their methodology. TimeML uses the TLINK annotation to tag a relationship type between events and time. The time complexity is quadratic when it comes to tagging documents with TLINK using human annotation. This research proposes using decision tree and parsing to improve the relationship type tagging. This research attempts to solve the gaps in human annotation by automating the task of relationship type tagging in an attempt to improve the accuracy of event and time relationship in annotated documents. Scope information: The documents from the domain of news will be used. The tagging will be performed within the same document and not across documents. The relationship types will be identified only for a pair of event and time and not a chain of events. The research focuses on documents tagged using the TimeML specification which contains tags such as EVENT, TLINK, and TIMEX. Each tag has attributes such as identifier, relation, POS, time etc.
Submitted 22 July, 2016;
originally announced July 2016.
-
Analysis of Software Delivery Process Shortcomings and Architectural Pitfalls
Authors:
Amol Patwardhan
Abstract:
This paper highlights the common pitfalls of overcomplicating the software architecture, development and delivery process by examining two enterprise level web application products built using the Microsoft .NET Framework. The aim of this paper is to identify, discuss and analyze architectural, development and deployment issues and learn lessons using real world examples from the chosen software products as case studies.
Submitted 9 July, 2016;
originally announced July 2016.
-
Augmenting Supervised Emotion Recognition with Rule-Based Decision Model
Authors:
Amol Patwardhan,
Gerald Knapp
Abstract:
The aim of this research is development of rule based decision model for emotion recognition. This research also proposes using the rules for augmenting inter-corporal recognition accuracy in multimodal systems that use supervised learning techniques. The classifiers for such learning based recognition systems are susceptible to over fitting and only perform well on intra-corporal data. To overcome the limitation this research proposes using rule based model as an additional modality. The rules were developed using raw feature data from visual channel, based on human annotator agreement and existing studies that have attributed movement and postures to emotions. The outcome of the rule evaluations was combined during the decision phase of emotion recognition system. The results indicate rule based emotion recognition augment recognition accuracy of learning based systems and also provide better recognition rate across inter corpus emotion test data.
Submitted 9 July, 2016;
originally announced July 2016.
-
Multimodal Affect Recognition using Kinect
Authors:
Amol Patwardhan,
Gerald Knapp
Abstract:
Affect (emotion) recognition has gained significant attention from researchers in the past decade. Emotion-aware computer systems and devices have many applications ranging from interactive robots, intelligent online tutor to emotion based navigation assistant. In this research data from multiple modalities such as face, head, hand, body and speech was utilized for affect recognition. The research used color and depth sensing device such as Kinect for facial feature extraction and tracking human body joints. Temporal features across multiple frames were used for affect recognition. Event driven decision level fusion was used to combine the results from each individual modality using majority voting to recognize the emotions. The study also implemented affect recognition by matching the features to the rule based emotion templates per modality. Experiments showed that multimodal affect recognition rates using combination of emotion templates and supervised learning were better compared to recognition rates based on supervised learning alone. Recognition rates obtained using temporal feature were higher compared to recognition rates obtained using position based features only.
Submitted 9 July, 2016;
originally announced July 2016.
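The decision-level fusion mentioned in this abstract (majority voting over the results of the individual modalities) can be illustrated with a minimal sketch. The modality names follow the abstract, but the emotion labels, the function name `fuse_by_majority` and the tie-breaking rule are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def fuse_by_majority(predictions):
    """Fuse per-modality emotion labels by majority vote.

    `predictions` maps a modality name to the label its classifier produced.
    Ties are broken in favour of the label that appears first in the mapping
    (an assumption; the abstract does not say how ties are handled).
    """
    if not predictions:
        raise ValueError("need at least one modality prediction")
    counts = Counter(predictions.values())
    best = max(counts.values())
    # Walk the labels in insertion order so the tie-break is deterministic.
    for label in predictions.values():
        if counts[label] == best:
            return label


# Hypothetical per-modality outputs for a single event.
example = {
    "face": "happy",
    "head": "neutral",
    "hand": "happy",
    "body": "happy",
    "speech": "angry",
}
fused = fuse_by_majority(example)
print(fused)
```

A weighted vote, or a fallback to the most confident modality, would be equally plausible designs; plain majority voting is shown only because it is the scheme the abstract names.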
-
Embracing Agile methodology during DevOps Developer Internship Program
Authors:
Amol Patwardhan,
Jon Kidd,
Tiffany Urena,
Aishwarya Rajgopalan
Abstract:
The DevOps team adopted agile methodologies during the summer internship program as an initiative to move away from waterfall. The DevOps team implemented the Scrum software development strategy to create an internal data dictionary web application. This article reports on the transition process and lessons learned from the pilot program.
Submitted 7 July, 2016;
originally announced July 2016.
-
EmoFit: Affect Monitoring System for Sedentary Jobs
Authors:
Amol Patwardhan,
Gerald Knapp
Abstract:
Emotional and physical well-being at workplace is important for a positive work environment and higher productivity. Jobs such as software programming lead to a sedentary lifestyle and require high interaction with computers. Working at the same job for years can cause a feeling of intellectual stagnation and lack of drive. Many employees experience lack of motivation, mild to extreme depression due to reasons such as aversion towards job responsibilities and incompatibility with coworkers or boss. This research proposed an affect monitoring system EmoFit that would play the role of psychological and physical health trainer. The day to day computer activity and body language was analyzed to detect the physical and emotional well-being of the user. Keystrokes, activity interruptions, eye tracking, facial expressions, body posture and speech were monitored to gauge the user's health. The system also provided activities such as at-desk exercise and stress relief game and motivational quotes in an attempt to promote the user's well-being. The experimental results and positive feedback from test subjects showed that EmoFit would help improve emotional and physical well-being at jobs that involve significant computer usage.
Submitted 4 July, 2016;
originally announced July 2016.
-
Aggressive actions and anger detection from multiple modalities using Kinect
Authors:
Amol Patwardhan,
Gerald Knapp
Abstract:
Prison facilities, mental correctional institutions, sports bars and places of public protest are prone to sudden violence and conflicts. Surveillance systems play an important role in mitigation of hostile behavior and improvement of security by detecting such provocative and aggressive activities. This research proposed using automatic aggressive behavior and anger detection to improve the effectiveness of the surveillance systems. An emotion and aggression aware component will make the surveillance system highly responsive and capable of alerting the security guards in real time. This research proposed facial expression, head, hand and body movement and speech tracking for detecting anger and aggressive actions. Recognition was achieved using support vector machines and rule based features. The multimodal affect recognition precision rate for anger improved by 15.2% and recall rate improved by 11.7% when behavioral rule based features were used in aggressive action detection.
Submitted 4 July, 2016;
originally announced July 2016.
-
Affect Intensity Estimation Using Multiple Modalities
Authors:
Amol Patwardhan,
Gerald Knapp
Abstract:
One of the challenges in affect recognition is accurate estimation of the emotion intensity level. This research proposes development of an affect intensity estimation model based on a weighted sum of classification confidence levels, displacement of feature points and speed of feature point motion. The parameters of the model were calculated from data captured using multiple modalities such as face, body posture, hand movement and speech. A preliminary study was conducted to compare the accuracy of the model with the annotated intensity levels. An emotion intensity scale ranging from 0 to 1 along the arousal dimension in the emotion space was used. Results indicated speech and hand modality significantly contributed in improving accuracy in emotion intensity estimation using the proposed model.
Submitted 4 July, 2016;
originally announced July 2016.
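The weighted-sum intensity model described in this abstract can be sketched as follows. The three inputs (classification confidence, feature-point displacement, feature-point speed) and the 0 to 1 scale come from the abstract; the weight values, the assumption that inputs are already normalised to [0, 1], the clamping step and the name `estimate_intensity` are hypothetical.

```python
def estimate_intensity(confidence, displacement, speed, weights=(0.5, 0.3, 0.2)):
    """Estimate emotion intensity as a weighted sum of three cues.

    All inputs are assumed to be normalised to [0, 1] beforehand.
    The default weights are placeholders, not values from the paper.
    """
    w_conf, w_disp, w_speed = weights
    score = w_conf * confidence + w_disp * displacement + w_speed * speed
    # Clamp to the 0..1 intensity scale used along the arousal dimension.
    return min(1.0, max(0.0, score))


# Hypothetical normalised measurements for one observation.
intensity = estimate_intensity(confidence=0.8, displacement=0.5, speed=0.5)
print(round(intensity, 2))
```

In practice the weights would be fitted against annotated intensity levels, which is what the preliminary study in the abstract compares against.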
-
Self-Contained Cross-Cutting Pipeline Software Architecture
Authors:
Amol Patwardhan,
Rahul Patwardhan,
Sumalini Vartak
Abstract:
Layered software architecture contains several intra-layer and inter-layer dependencies. Each layer depends on shared components making it difficult to release a code change, bug fix or feature without exhaustive testing and having to build the entire software code base. This paper proposed self-contained, cross-cutting pipeline architecture (SCPA) that is independent of existing layers. We chose 2 open source projects and 3 internal intern projects that used n-tier architecture and applied the SCPA to release subsequent feature additions and any bug fixes. The SCPA decreased the release time by 42.99%. The lines of delivered code (LOC), increased by 22.58%. The number of defects found in existing functionality decreased by 85.54%. The SCPA also provided ability to roll back or switch off the new feature quickly. SCPA proved a suitable architecture for agile software development and continuous deployment.
Submitted 25 June, 2016;
originally announced June 2016.
-
XML Entity Architecture for Efficient Software Integration
Authors:
Amol Patwardhan,
Rahul Patwardhan
Abstract:
This paper proposed an XML entities based architectural implementation to improve integration between multiple third-party vendor software systems with incompatible XML schema. The XML entity architecture implementation showed that the lines of code change required for mapping the schema between in-house software and three other vendor schema decreased by 5.2%, indicating an improvement in quality. The schema mapping development time decreased by 3.8% and overall release time decreased by 5.3%, indicating an improvement in productivity. The proposed technique proved that using XML entities and XSLT transforms is more efficient in terms of coding effort and deployment complexity when compared to mapping the schema using an object oriented language such as C#.
Submitted 25 June, 2016;
originally announced June 2016.