-
Zero-shot detection of buildings in mobile LiDAR using Language Vision Model
Authors:
June Moh Goo,
Zichao Zeng,
Jan Boehm
Abstract:
Recent advances have demonstrated that Language Vision Models (LVMs) surpass the existing State-of-the-Art (SOTA) in two-dimensional (2D) computer vision tasks, motivating attempts to apply LVMs to three-dimensional (3D) data. While LVMs are efficient and effective in addressing various downstream 2D vision tasks without training, they face significant challenges when it comes to point clouds, a representative format for representing 3D data. It is more difficult to extract features from 3D data and there are challenges due to large data sizes and the cost of the collection and labelling, resulting in a notably limited availability of datasets. Moreover, constructing LVMs for point clouds is even more challenging due to the requirements for large amounts of data and training time. To address these issues, our research aims to 1) apply the Grounded SAM through Spherical Projection to transfer 3D to 2D, and 2) experiment with synthetic data to evaluate its effectiveness in bridging the gap between synthetic and real-world data domains. Our approach exhibited high performance with an accuracy of 0.96, an IoU of 0.85, precision of 0.92, recall of 0.91, and an F1 score of 0.92, confirming its potential. However, challenges such as occlusion problems and pixel-level overlaps of multi-label points during spherical image generation remain to be addressed in future studies.
Submitted 15 April, 2024;
originally announced April 2024.
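As an illustration of the 3D-to-2D step mentioned in the abstract above, here is a minimal sketch of a spherical (equirectangular) projection. It is not the authors' implementation: the function name `spherical_project`, the image size, and the pixel convention are all assumptions made for this example.

```python
import math

def spherical_project(points, width, height):
    """Map 3D points (x, y, z) to (row, col) pixels of an equirectangular image.

    Hypothetical helper: the sensor is assumed to sit at the origin.
    """
    pixels = []
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            pixels.append(None)  # the origin itself has no viewing direction
            continue
        azimuth = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        elevation = math.asin(z / r)  # vertical angle in [-pi/2, pi/2]
        col = int((azimuth + math.pi) / (2 * math.pi) * width) % width
        row = min(int((math.pi / 2 - elevation) / math.pi * height), height - 1)
        pixels.append((row, col))
    return pixels

# Two points on the same ray land on the same pixel, which is one way the
# pixel-level overlap noted in the abstract can arise.
pixels = spherical_project([(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)], 360, 180)
```

Because range is discarded by the projection, any labelling scheme built on it has to decide which point "owns" a shared pixel, for example the nearest one.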
-
Zero-shot Building Age Classification from Facade Image Using GPT-4
Authors:
Zichao Zeng,
June Moh Goo,
Xinglei Wang,
Bin Chi,
Meihui Wang,
Jan Boehm
Abstract:
A building's age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images using deep learning. However, building an accurate deep learning model requires a considerable amount of labelled training data, and the trained models often have geographical constraints. Recently, large pre-trained vision language models (VLMs) such as GPT-4 Vision, which demonstrate significant generalisation capabilities, have emerged as potential training-free tools for dealing with specific vision tasks, but their applicability and reliability for building information remain unexplored. In this study, a zero-shot building age classifier for facade images is developed using prompts that include logical instructions. Taking London as a test case, we introduce a new dataset, FI-London, comprising facade images and building age epochs. Although the training-free classifier achieved a modest accuracy of 39.69%, the mean absolute error of 0.85 decades indicates that the model can predict building age epochs successfully albeit with a small bias. The ensuing discussion reveals that the classifier struggles to predict the age of very old buildings and is challenged by fine-grained predictions within 2 decades. Overall, the classifier utilising GPT-4 Vision is capable of predicting the rough age epoch of a building from a single facade image without any training.
Submitted 15 April, 2024;
originally announced April 2024.
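The abstract above reports both an accuracy and a mean absolute error over age epochs. The sketch below shows how the two metrics can differ on ordinal classes. It assumes epochs are encoded as consecutive integer indices; that encoding and the function name are hypothetical, not taken from the paper.

```python
def accuracy_and_mae(true_idx, pred_idx):
    """Return (accuracy, mean absolute error) for ordinal class indices."""
    n = len(true_idx)
    correct = sum(t == p for t, p in zip(true_idx, pred_idx))
    abs_err = sum(abs(t - p) for t, p in zip(true_idx, pred_idx))
    return correct / n, abs_err / n

# Toy data: most predictions are wrong, but only by one epoch.
acc, mae = accuracy_and_mae([0, 1, 2, 3, 4], [0, 2, 2, 4, 3])
```

A classifier that is often off by one neighbouring epoch scores low on accuracy while keeping a small MAE, which is the pattern the abstract describes.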
-
Accurate, scalable, and efficient Bayesian Optimal Experimental Design with derivative-informed neural operators
Authors:
Jinwoo Go,
Peng Chen
Abstract:
We consider optimal experimental design (OED) problems in selecting the most informative observation sensors to estimate model parameters in a Bayesian framework. Such problems are computationally prohibitive when the parameter-to-observable (PtO) map is expensive to evaluate, the parameters are high-dimensional, and the optimization for sensor selection is combinatorial and high-dimensional. To address these challenges, we develop an accurate, scalable, and efficient computational framework based on derivative-informed neural operators (DINOs). The derivative of the PtO map is essential for accurate evaluation of the optimality criteria of OED in our consideration. We take the key advantage of DINOs, a class of neural operators trained with derivative information, to achieve high approximate accuracy of not only the PtO map but also, more importantly, its derivative. Moreover, we develop scalable and efficient computation of the optimality criteria based on DINOs and propose a modified swapping greedy algorithm for its optimization. We demonstrate that the proposed method is scalable to preserve the accuracy for increasing parameter dimensions and achieves high computational efficiency, with an over 1000x speedup accounting for both offline construction and online evaluation costs, compared to high-fidelity Bayesian OED solutions for a three-dimensional nonlinear convection-diffusion-reaction example with tens of thousands of parameters.
Submitted 27 March, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
Socio-Economic Deprivation Analysis: Diffusion Maps
Authors:
June Moh Goo
Abstract:
This report proposes a model to predict the location of the most deprived areas in a city using data from the census. Census data is very high-dimensional and needs to be simplified. We use a novel algorithm to reduce dimensionality and find patterns: the diffusion map. Features are defined by eigenvectors of the Laplacian matrix that defines the diffusion map. Eigenvectors corresponding to the smallest eigenvalues indicate specific population features. Previous work has found qualitatively that the second most important dimension for describing the census data in Bristol is linked to deprivation. In this report, we analyse how good this dimension is as a model for predicting deprivation by comparing it with the recognised measures. The Pearson correlation coefficient was found to be over 0.7. The areas in the top 10 per cent of deprived areas in the UK that are also located in Bristol are extracted to test the accuracy of the model. There are 52 such most deprived areas, and 38 of them are correctly identified by the model. The influence of scores of IMD domains that do not correlate with the model, Eigenvector 2 entries of non-deprived OAs, and orthogonality of eigenvectors cause the model to fail to predict 14 deprived areas.
However, overall, the model shows high performance in predicting the future deprivation of the areas that the project considers. This project is expected to support the government in allocating resources and funding.
Submitted 15 December, 2023;
originally announced December 2023.
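The two figures quoted in the abstract above, a Pearson correlation and the share of deprived areas recovered, can be computed as in the following sketch. The `pearson` helper and the toy vectors are illustrative assumptions; only the counts 38 and 52 come from the abstract.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy check on perfectly correlated data (not census data).
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# Share of the 52 most deprived areas that the model identified: about 0.73.
identified_fraction = 38 / 52
```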
-
Spatial Bias for Attention-free Non-local Neural Networks
Authors:
Junhyung Go,
Jongbin Ryu
Abstract:
In this paper, we introduce the spatial bias to learn global knowledge without self-attention in convolutional neural networks. Owing to the limited receptive field, conventional convolutional neural networks suffer from learning long-range dependencies. Non-local neural networks have struggled to learn global knowledge, but unavoidably have too heavy a network design due to the self-attention operation. Therefore, we propose a fast and lightweight spatial bias that efficiently encodes global knowledge without self-attention on convolutional neural networks. Spatial bias is stacked on the feature map and convolved together to adjust the spatial structure of the convolutional features. Therefore, we learn the global knowledge on the convolution layer directly with very few additional resources. Our method is very fast and lightweight due to the attention-free non-local method while improving the performance of neural networks considerably. Compared to non-local neural networks, the spatial bias uses about 10 times fewer parameters while achieving comparable performance with 1.6 ~ 3.3 times more throughput on a very small budget. Furthermore, the spatial bias can be used with conventional non-local neural networks to further improve the performance of the backbone model. We show that the spatial bias achieves competitive performance that improves the classification accuracy by +0.79% and +1.5% on the ImageNet-1K and CIFAR-100 datasets. Additionally, we validate our method on the MS-COCO and ADE20K datasets for downstream tasks involving object detection and semantic segmentation.
Submitted 24 February, 2023;
originally announced February 2023.
-
Suppression of Spontaneous Defect Formation in Inhomogeneous Bose Gases
Authors:
Myeonghyeon Kim,
Tenzin Rabga,
Yangheon Lee,
Junhong Goo,
Dalmin Bae,
Yong-il Shin
Abstract:
In phase transition dynamics involving symmetry breaking, topological defects can be spontaneously created but it is suppressed in a spatially inhomogeneous system due to the spreading of the ordered phase information. We demonstrate the defect suppression effect in a trapped atomic Bose gas which is quenched into a superfluid phase. The spatial distribution of created defects is measured for various quench times and it is shown that for slower quenches, the spontaneous defect production is relatively more suppressed in the sample's outer region with higher atomic density gradient. The power-law scaling of the local defect density with the quench time is enhanced in the outer region, which is consistent with the Kibble-Zurek mechanism including the causality effect due to the spatial inhomogeneity of the system. This work opens an avenue in the study of nonequilibrium phase transition dynamics using the defect position information.
Submitted 3 August, 2022;
originally announced August 2022.
-
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Authors:
Yeonghyeon Lee,
Kangwook Jang,
Jahyun Goo,
Youngmoon Jung,
Hoirin Kim
Abstract:
Large-scale speech self-supervised learning (SSL) has emerged as a main field of speech processing; however, the computational cost arising from its vast size creates a high barrier to entry for academia. In addition, existing distillation techniques of speech SSL models compress the model by reducing layers, which induces performance degradation in linguistic pattern recognition tasks such as phoneme recognition (PR). In this paper, we propose FitHuBERT, which is thinner in dimension throughout almost all model components and deeper in layers compared to prior speech SSL distillation works. Moreover, we employ a time-reduction layer to speed up inference time and propose a method of hint-based distillation for less performance degradation. Our method reduces the model to 23.8% in size and 35.9% in inference time compared to HuBERT. Also, we achieve 12.1% word error rate and 13.3% phoneme error rate on the SUPERB benchmark, which is superior to prior work.
Submitted 1 July, 2022;
originally announced July 2022.
-
Robust Expected Information Gain for Optimal Bayesian Experimental Design Using Ambiguity Sets
Authors:
Jinwoo Go,
Tobin Isaac
Abstract:
The ranking of experiments by expected information gain (EIG) in Bayesian experimental design is sensitive to changes in the model's prior distribution, and the approximation of EIG yielded by sampling will have errors similar to the use of a perturbed prior. We define and analyze \emph{robust expected information gain} (REIG), a modification of the objective in EIG maximization by minimizing an affine relaxation of EIG over an ambiguity set of distributions that are close to the original prior in KL-divergence. We show that, when combined with a sampling-based approach to estimating EIG, REIG corresponds to a `log-sum-exp' stabilization of the samples used to estimate EIG, meaning that it can be efficiently implemented in practice. Numerical tests combining REIG with variational nested Monte Carlo (VNMC), adaptive contrastive estimation (ACE) and mutual information neural estimation (MINE) suggest that in practice REIG also compensates for the variability of under-sampled estimators.
Submitted 19 May, 2022;
originally announced May 2022.
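The abstract above refers to a log-sum-exp stabilization. The sketch below shows only the generic, numerically stable log-sum-exp operation; it is not the REIG estimator itself, and how the paper applies it to EIG samples is not reproduced here.

```python
import math

def log_sum_exp(values):
    """Compute log(sum(exp(v) for v in values)) without overflow."""
    m = max(values)
    if math.isinf(m):
        return m  # all terms are -inf, or at least one term is +inf
    # Subtracting the maximum keeps every exponent at or below zero.
    return m + math.log(sum(math.exp(v - m) for v in values))

# A naive math.exp(1000.0) would raise OverflowError; the shifted form does not.
stable = log_sum_exp([1000.0, 1000.0])
```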
-
Fake News Quick Detection on Dynamic Heterogeneous Information Networks
Authors:
Jin Ho Go,
Alina Sari,
Jiaojiao Jiang,
Shuiqiao Yang,
Sanjay Jha
Abstract:
The spread of fake news has caused great harm to society in recent years. So the quick detection of fake news has become an important task. Some current detection methods often model news articles and other related components as a static heterogeneous information network (HIN) and use expensive message-passing algorithms. However, in the real-world, quickly identifying fake news is of great significance and the network may vary over time in terms of dynamic nodes and edges. Therefore, in this paper, we propose a novel Dynamic Heterogeneous Graph Neural Network (DHGNN) for fake news quick detection. More specifically, we first implement BERT and fine-tuned BERT to get a semantic representation of the news article contents and author profiles and convert it into graph data. Then, we construct the heterogeneous news-author graph to reflect contextual information and relationships. Additionally, we adapt ideas from personalized PageRank propagation and dynamic propagation to heterogeneous networks in order to reduce the time complexity of back-propagating through many nodes during training. Experiments on three real-world fake news datasets show that DHGNN can outperform other GNN-based models in terms of both effectiveness and efficiency.
Submitted 14 May, 2022;
originally announced May 2022.
-
Vortex shedding frequency of a moving obstacle in a Bose-Einstein condensate
Authors:
Younghoon Lim,
Yangheon Lee,
Junhong Goo,
Dalmin Bae,
Yong-il Shin
Abstract:
We experimentally investigate the periodic vortex shedding dynamics in a highly oblate Bose-Einstein condensate using a moving penetrable Gaussian obstacle. The shedding frequency $f_v$ is measured as a function of the obstacle velocity $v$ and characterized by a linear relationship of $f_v=a(v-v_c)$ with $v_c$ being the critical velocity. The proportionality constant $a$ is linearly decreased with a decrease in the obstacle strength, whereas $v_c$ approaches the speed of sound. When the obstacle size increases, both $a$ and $v_c$ are decreased. The critical vortex shedding is further investigated for an oscillating obstacle and found to be consistent with the measured $f_v$. When the obstacle's maximum velocity exceeds $v_c$ but its oscillation amplitude is not large enough to create a vortex dipole, we observe that vortices are generated in the low-density boundary region of the trapped condensate, which is attributed to the phonon emission from the oscillating obstacle. Finally, we discuss a possible asymptotic association of $a$ with the Strouhal number in the context of universal shedding dynamics of a superfluid.
Submitted 27 February, 2022;
originally announced February 2022.
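The linear relationship $f_v=a(v-v_c)$ in the abstract above can be recovered from measurements with an ordinary least-squares line fit, since the slope gives $a$ and the zero crossing gives $v_c$. The sketch below uses synthetic numbers, not the paper's data, and the function name is hypothetical.

```python
def fit_shedding_line(velocities, frequencies):
    """Least-squares fit of f = a * (v - v_c); returns (a, v_c)."""
    n = len(velocities)
    mv = sum(velocities) / n
    mf = sum(frequencies) / n
    sxx = sum((v - mv) ** 2 for v in velocities)
    sxy = sum((v - mv) * (f - mf) for v, f in zip(velocities, frequencies))
    a = sxy / sxx               # slope of the fitted line
    intercept = mf - a * mv     # f at v = 0, equal to -a * v_c
    v_c = -intercept / a        # velocity at which the fitted f reaches zero
    return a, v_c

# Synthetic points generated from a = 2.0 and v_c = 1.0 (arbitrary units).
a, v_c = fit_shedding_line([1.5, 2.0, 2.5, 3.0], [1.0, 2.0, 3.0, 4.0])
```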
-
Universal Early Coarsening of Quenched Bose Gases
Authors:
Junhong Goo,
Yangheon Lee,
Younghoon Lim,
Dalmin Bae,
Tenzin Rabga,
Yong-il Shin
Abstract:
We investigate the early coarsening dynamics of an atomic Bose gas quenched into a superfluid phase. Using a two-step quench protocol, we effectively control the cooling rates, $r_1$ and $r_2$, during and after passing through the critical region, respectively, and measure the number of quantum vortices spontaneously created in the system. The latter cooling rate $r_2$ regulates the temperature during the condensate growth, consequently controlling the early coarsening dynamics in the defect formation. We find that the defect number shows a scaling behavior with $r_2$ regardless of the initial cooling rate $r_1$, indicating universal coarsening dynamics in the early stage of condensate growth. Our results demonstrate that early coarsening not only reduces the defect density but also affects its scaling with the quench rate, which is beyond the Kibble-Zurek mechanism.
Submitted 10 December, 2021;
originally announced December 2021.
-
The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification
Authors:
Ujjwal Baid,
Satyam Ghodasara,
Suyash Mohan,
Michel Bilello,
Evan Calabrese,
Errol Colak,
Keyvan Farahani,
Jayashree Kalpathy-Cramer,
Felipe C. Kitamura,
Sarthak Pati,
Luciano M. Prevedello,
Jeffrey D. Rudie,
Chiharu Sako,
Russell T. Shinohara,
Timothy Bergquist,
Rong Chai,
James Eddy,
Julia Elliott,
Walter Reade,
Thomas Schaffter,
Thomas Yu,
Jiaxin Zheng,
Ahmed W. Moawad,
Luiz Otavio Coelho,
Olivia McDonnell
, et al. (78 additional authors not shown)
Abstract:
The BraTS 2021 challenge celebrates its 10th anniversary and is jointly organized by the Radiological Society of North America (RSNA), the American Society of Neuroradiology (ASNR), and the Medical Image Computing and Computer Assisted Interventions (MICCAI) society. Since its inception, BraTS has been focusing on being a common benchmarking venue for brain glioma segmentation algorithms, with well-curated multi-institutional multi-parametric magnetic resonance imaging (mpMRI) data. Gliomas are the most common primary malignancies of the central nervous system, with varying degrees of aggressiveness and prognosis. The RSNA-ASNR-MICCAI BraTS 2021 challenge targets the evaluation of computational algorithms assessing the same tumor compartmentalization, as well as the underlying tumor's molecular characterization, in pre-operative baseline mpMRI data from 2,040 patients. Specifically, the two tasks that BraTS 2021 focuses on are: a) the segmentation of the histologically distinct brain tumor sub-regions, and b) the classification of the tumor's O[6]-methylguanine-DNA methyltransferase (MGMT) promoter methylation status. The performance evaluation of all participating algorithms in BraTS 2021 will be conducted through the Sage Bionetworks Synapse platform (Task 1) and Kaggle (Task 2), concluding in distributing to the top ranked participants monetary awards of $60,000 collectively.
Submitted 12 September, 2021; v1 submitted 5 July, 2021;
originally announced July 2021.
-
Defect Saturation in a Rapidly Quenched Bose Gas
Authors:
Junhong Goo,
Younghoon Lim,
Yong-il Shin
Abstract:
We investigate the saturation of defect density in an atomic Bose gas rapidly cooled into a superfluid phase. The number of quantum vortices, which are spontaneously created in the quenched gas, exhibits a Poissonian distribution not only for a slow quench in the Kibble-Zurek (KZ) scaling regime but also for a fast quench in which case the mean vortex number is saturated. This shows that the saturation is not caused by destructive vortex collisions, but by the early-time coarsening in an emerging condensate, which is further supported by the observation that the condensate growth lags the quenching in the saturation regime. Our results demonstrate that the defect saturation is an effect beyond the KZ mechanism, opening a path for studying critical phase transition dynamics using the defect number distribution.
Submitted 6 August, 2021; v1 submitted 13 May, 2021;
originally announced May 2021.
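One common way to check whether counts are Poissonian, as the abstract above reports for vortex numbers, is the Fano factor (variance divided by mean), which equals 1 for a Poisson distribution. The sketch below is an illustrative assumption about how such a check could be done, not the analysis used in the paper, and the counts are made up.

```python
def fano_factor(counts):
    """Sample variance divided by sample mean; equals 1 for Poisson counts."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)  # unbiased variance
    return var / mean

# Made-up vortex counts from six hypothetical experimental runs.
fano = fano_factor([2, 4, 3, 5, 1, 3])
```

Values well below 1 would indicate sub-Poissonian statistics, and values well above 1 would indicate extra run-to-run fluctuations.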
-
Large-area $^{87}$Rb Bose-Einstein condensate in a clipped-Gaussian optical dipole trap
Authors:
Younghoon Lim,
Junhong Goo,
Haneul Kwak,
Yong-il Shin
Abstract:
We demonstrate a production of large-area $^{87}$Rb Bose-Einstein condensates (BECs) using a non-Gaussian optical dipole trap (ODT). The ODT is formed by focusing a symmetrically truncated Gaussian laser beam and it is shown that the beam clipping causes the trap geometry to be elongated and flattened along the beam axis direction. In the clipped-Gaussian ODT, an elongated, highly oblate BEC of $^{87}$Rb is generated with length and width of approximately $470~\mu\textrm{m}$ and $130~\mu\textrm{m}$, respectively, where the condensate healing length is estimated to be $\xi\approx 0.25~\mu\textrm{m}$ at the trap center. The ODT is characterized to have a quartic trapping potential along the beam axis and the atom density of the condensate is uniform within 10% over $1000\xi$ in the central region. Finally, we discuss the prospect of conducting vortex shedding experiments using the elongated condensate.
Submitted 19 April, 2021;
originally announced April 2021.
-
Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention
Authors:
Myunghun Jung,
Youngmoon Jung,
Jahyun Goo,
Hoirin Kim
Abstract:
Keyword spotting (KWS) and speaker verification (SV) have been studied independently although it is known that acoustic and speaker domains are complementary. In this paper, we propose a multi-task network that performs KWS and SV simultaneously to fully utilize the interrelated domain information. The multi-task network tightly combines sub-networks aiming at performance improvement in challenging conditions such as noisy environments, open-vocabulary KWS, and short-duration SV, by introducing novel techniques of connectionist temporal classification (CTC)-based soft voice activity detection (VAD) and global query attention. Frame-level acoustic and speaker information is integrated with phonetically originated weights so that it forms a word-level global representation. This representation is then used for the aggregation of feature vectors to generate discriminative embeddings. Our proposed approach shows 4.06% and 26.71% relative improvements in equal error rate (EER) compared to the baselines for both tasks. We also present a visualization example and results of ablation experiments.
Submitted 7 August, 2020; v1 submitted 8 May, 2020;
originally announced May 2020.
-
Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings
Authors:
Myunghun Jung,
Hyungjun Lim,
Jahyun Goo,
Youngmoon Jung,
Hoirin Kim
Abstract:
Acoustic word embeddings --- fixed-dimensional vector representations of arbitrary-length words --- have attracted increasing interest in query-by-example spoken term detection. Recently, on the fact that the orthography of text labels partly reflects the phonetic similarity between the words' pronunciation, a multi-view approach has been introduced that jointly learns acoustic and text embeddings. It showed that it is possible to learn discriminative embeddings by designing the objective which takes text labels as well as word segments. In this paper, we propose a network architecture that expands the multi-view approach by combining the Siamese multi-view encoders with a shared decoder network to maximize the effect of the relationship between acoustic and text embeddings in embedding space. Discriminatively trained with multi-view triplet loss and decoding loss, our proposed approach achieves better performance on acoustic word discrimination task with the WSJ dataset, resulting in 11.1% relative improvement in average precision. We also present experimental results on cross-view word discrimination and word level speech recognition tasks.
Submitted 1 October, 2019;
originally announced October 2019.
-
Unclogging Our Arteries: Using Human-Inspired Signals to Disambiguate Navigational Intentions
Authors:
Justin Hart,
Reuth Mirsky,
Stone Tejeda,
Bonny Mahajan,
Jamin Goo,
Kathryn Baldauf,
Sydney Owen,
Peter Stone
Abstract:
People are proficient at communicating their intentions in order to avoid conflicts when navigating in narrow, crowded environments. In many situations mobile robots lack both the ability to interpret human intentions and the ability to clearly communicate their own intentions to people sharing their space. This work addresses the second of these points, leveraging insights about how people implicitly communicate with each other through observations of behaviors such as gaze to provide mobile robots with better social navigation skills. In a preliminary human study, the importance of gaze as a signal used by people to interpret each-other's intentions during navigation of a shared space is observed. This study is followed by the development of a virtual agent head which is mounted to the top of the chassis of the BWIBot mobile robot platform. Contrasting the performance of the virtual agent head against an LED turn signal demonstrates that the naturalistic, implicit gaze cue is more easily interpreted than the LED turn signal.
Submitted 6 November, 2019; v1 submitted 14 September, 2019;
originally announced September 2019.
-
2018 Low-Power Image Recognition Challenge
Authors:
Sergei Alyamkin,
Matthew Ardi,
Achille Brighton,
Alexander C. Berg,
Yiran Chen,
Hsin-Pai Cheng,
Bo Chen,
Zichen Fan,
Chen Feng,
Bo Fu,
Kent Gauen,
Jongkook Go,
Alexander Goncharenko,
Xuyang Guo,
Hong Hanh Nguyen,
Andrew Howard,
Yuanjun Huang,
Donghyun Kang,
Jaeyoun Kim,
Alexander Kondratyev,
Seungjae Lee,
Suwoong Lee,
Junhyeok Lee,
Zhiyu Liang,
Xin Liu
, et al. (16 additional authors not shown)
Abstract:
The Low-Power Image Recognition Challenge (LPIRC, https://rebootingcomputing.ieee.org/lpirc) is an annual competition started in 2015. The competition identifies the best technologies that can classify and detect objects in images efficiently (short execution time and low energy consumption) and accurately (high precision). Over the four years, the winners' scores have improved more than 24 times. As computer vision is widely used in many battery-powered systems (such as drones and mobile phones), the need for low-power computer vision will become increasingly important. This paper summarizes LPIRC 2018 by describing the three different tracks and the winners' solutions.
Submitted 3 October, 2018;
originally announced October 2018.
-
Fast Desktop-Scale Extrusion Additive Manufacturing
Authors:
Jamison Go,
A. John Hart
Abstract:
Significant improvements to the production rate of additive manufacturing (AM) technologies are essential to their cost-effectiveness and competitiveness with traditional processing routes. Moreover, much faster AM processes, in combination with the geometric versatility of AM, will enable entirely new workflows for product design and customization. We present the design and validation of a desktop-scale extrusion AM system that achieves a far greater build rate than benchmarked commercial systems. This system, which we call FastFFF, is motivated by our recent analysis of the rate-limiting mechanisms of conventional fused filament fabrication (FFF) technology. The FastFFF system mutually overcomes these limits, using a nut-feed extruder, laser-heated polymer liquefier, and servo-driven parallel gantry system to achieve high extrusion force, rapid filament heating, and fast gantry motion, respectively. The extrusion and heating mechanisms are contained in a compact printhead that receives a threaded filament and augments conduction heat transfer with a fiber-coupled diode laser. The prototype system achieves a volumetric build rate of 127 cm³/hr, which is 7-fold greater than commercial desktop FFF systems, at comparable resolution; the maximum extrusion rate of the printhead is ~14-fold greater (282 cm³/hr). The performance limits of the printhead and motion systems are characterized, and the tradeoffs between build rate and resolution are assessed and discussed. The combination of high-speed motion and high deposition rate achieved by the FastFFF technology also poses challenges and opportunities for toolpath optimization and real-time deposition control. High-speed desktop printing raises the possibility of new use cases and business models for AM, where handheld parts are built in minutes rather than hours.
Submitted 2 July, 2017;
originally announced September 2017.
-
One-loop divergences of quantum gravity coupled with scalar electrodynamics
Authors:
Hyun Ju Go
Abstract:
In non-supersymmetric covariant quantum gravity theory, each system of gravity coupled with a single field is one-loop divergent. Since adding other fields or other interactions to each system generates more possible counter-Lagrangian terms, there is room for improvement to restore renormalizability. In this paper, we consider Einstein-Maxwell fields coupled with an electrically charged scalar, which is the simplest model among the systems of gravity coupled with multiple fields having their own interaction. We first describe how to calculate the possible one-loop diagrams in the Einstein-SQED system and then show that this system is non-renormalizable.
Submitted 10 December, 2017; v1 submitted 30 June, 2016;
originally announced June 2016.
-
A Framework for Teaching the Fundamentals of Additive Manufacturing and Enabling Rapid Innovation
Authors:
Jamison Go,
A. John Hart
Abstract:
The importance of additive manufacturing (AM) to the future of product design and manufacturing systems demands educational programs tailored to embrace its fundamental principles and its innovative potential. Moreover, the breadth and depth of AM spans several traditional disciplines, presenting a challenge to instructors yet giving the opportunity to integrate knowledge via creative and challenging projects. This paper presents our approach to teaching AM at the graduate and advanced undergraduate level, in the form of a 15-week course developed and taught at the Massachusetts Institute of Technology. The lectures begin with in-depth technical analysis of the major AM processes. In lab sessions, students operate and characterize desktop AM machines, and work in teams to design and fabricate a bridge having maximum strength per unit weight while conforming to geometric constraints. To conclude the semester, teams created prototype machines for printing molten glass, soft serve ice cream, biodegradable material, and carbon fiber composites, as well as for large area parallel extrusion and for in situ optical scanning during printing. We conclude that AM education, while arguably rooted in mechanical engineering, is truly multidisciplinary, and that education programs must embrace this context.
Submitted 30 October, 2015;
originally announced October 2015.
-
Statistical Interpretation of Femto-Molar Detection
Authors:
Jonghyun Go,
Muhammad A. Alam
Abstract:
Over the last decade, many experiments have demonstrated that nanobiosensors based on Nanotubes and Nanowires are significantly more sensitive compared to their planar counterparts. Yet, there has been a persistent gap between reports of analyte detection at ~femto-Molar concentration and theory suggesting the impossibility of sub-pM detection at the corresponding incubation time. This divide has persisted despite the sophistication of the theoretical models. In this paper, we calculate the statistics of the diffusion-limited arrival-time distribution by a Monte Carlo method to suggest a statistical resolution of the enduring puzzle: the incubation time in the theory is the mean incubation time, while experiments suggest device stability limited the minimum incubation time. The difference in incubation times, both described by characteristic power laws, provides an intuitive explanation of the different detection limits anticipated by theory and experiments. These power laws broaden the scope of problems amenable to the first-passage process used to quantify the stochastic biological processes.
Submitted 28 January, 2009;
originally announced January 2009.