Showing 1–11 of 11 results for author: Tan, J H

  1. arXiv:2405.05355  [pdf, other]

    cs.CV cs.RO

    Geometry-Informed Distance Candidate Selection for Adaptive Lightweight Omnidirectional Stereo Vision with Fisheye Images

    Authors: Conner Pulling, Je Hon Tan, Yaoyu Hu, Sebastian Scherer

    Abstract: Multi-view stereo omnidirectional distance estimation usually needs to build a cost volume with many hypothetical distance candidates. The cost volume building process is often computationally heavy considering the limited resources a mobile robot has. We propose a new geometry-informed way of distance candidates selection method which enables the use of a very small number of candidates and reduc…

    Submitted 8 May, 2024; originally announced May 2024.

  2. arXiv:2402.03752  [pdf, other]

    cs.CV cs.LG

    Pre-training of Lightweight Vision Transformers on Small Datasets with Minimally Scaled Images

    Authors: Jen Hong Tan

    Abstract: Can a lightweight Vision Transformer (ViT) match or exceed the performance of Convolutional Neural Networks (CNNs) like ResNet on small datasets with small image resolutions? This report demonstrates that a pure ViT can indeed achieve superior performance through pre-training, using a masked auto-encoder technique with minimal image scaling. Our experiments on the CIFAR-10 and CIFAR-100 datasets i…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 7 pages, 6 figures

  3. arXiv:2202.05451  [pdf, other]

    cs.CV cs.CL cs.LG

    ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning

    Authors: Jia Huei Tan, Ying Hua Tan, Chee Seng Chan, Joon Huang Chuah

    Abstract: Recent research that applies Transformer-based architectures to image captioning has resulted in state-of-the-art image captioning performance, capitalising on the success of Transformers on natural language tasks. Unfortunately, though these models work well, one major flaw is their large model sizes. To this end, we present three parameter reduction methods for image captioning Transformers: Rad…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: Neurocomputing; In Press

  4. arXiv:2112.14574  [pdf]

    eess.SY cs.CY cs.HC cs.RO

    Industry 4.0: Challenges and success factors for adopting digital technologies in airports

    Authors: Jia Hao Tan, Tariq Masood

    Abstract: With the advent of Industry 4.0 technologies in the last decade, airports have undergone digitalisation to capitalise on the purported benefits of these technologies such as improved operational efficiency and passenger experience. The ongoing COVID-19 pandemic with emergence of its variants (e.g. Delta, Omicron) has exacerbated the need for airports to adopt new technologies such as contactless a…

    Submitted 29 December, 2021; originally announced December 2021.

    Comments: 25 pages, 4 figures, 9 tables

  5. arXiv:2112.14333  [pdf]

    eess.SY cs.HC cs.RO

    Adoption of Industry 4.0 technologies in airports -- A systematic literature review

    Authors: Jia Hao Tan, Tariq Masood

    Abstract: Airports have been constantly evolving and adopting digital technologies to improve operational efficiency, enhance passenger experience, generate ancillary revenues and boost capacity from existing infrastructure. The COVID-19 pandemic has also challenged airports and aviation stakeholders alike to adapt and manage new operational challenges such as facilitating a contactless travel experience an…

    Submitted 28 December, 2021; originally announced December 2021.

    Comments: 25 pages, 2 figures, 2 tables, 106 references

  6. arXiv:2112.13384  [pdf, other]

    cs.LG cs.MM cs.SI

    Will You Dance To The Challenge? Predicting User Participation of TikTok Challenges

    Authors: Lynnette Hui Xian Ng, John Yeh Han Tan, Darryl Jing Heng Tan, Roy Ka-Wei Lee

    Abstract: TikTok is a popular new social media, where users express themselves through short video clips. A common form of interaction on the platform is participating in "challenges", which are songs and dances for users to iterate upon. Challenge contagion can be measured through replication reach, i.e., users uploading videos of their participation in the challenges. The uniqueness of the TikTok platform…

    Submitted 26 December, 2021; originally announced December 2021.

    Comments: Accepted at ASONAM 2021

  7. End-to-End Supermask Pruning: Learning to Prune Image Captioning Models

    Authors: Jia Huei Tan, Chee Seng Chan, Joon Huang Chuah

    Abstract: With the advancement of deep models, research work on image captioning has led to a remarkable gain in raw performance over the last decade, along with increasing model complexity and computational cost. However, surprisingly works on compression of deep networks for image captioning task has received little to no attention. For the first time in image captioning research, we provide an extensive…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: Pattern Recognition; In Press

  8. arXiv:1908.10797  [pdf, other]

    cs.CV cs.CL cs.LG

    Image Captioning with Sparse Recurrent Neural Network

    Authors: Jia Huei Tan, Chee Seng Chan, Joon Huang Chuah

    Abstract: Recurrent Neural Network (RNN) has been widely used to tackle a wide variety of language generation problems and are capable of attaining state-of-the-art (SOTA) performance. However despite its impressive results, the large number of parameters in the RNN model makes deployment to mobile and embedded devices infeasible. Driven by this problem, many works have proposed a number of pruning methods…

    Submitted 28 October, 2019; v1 submitted 28 August, 2019; originally announced August 2019.

    Comments: Corrected Eq 11, updated Table 5

  9. COMIC: Towards A Compact Image Captioning Model with Attention

    Authors: Jia Huei Tan, Chee Seng Chan, Joon Huang Chuah

    Abstract: Recent works in image captioning have shown very promising raw performance. However, we realize that most of these encoder-decoder style networks with attention do not scale naturally to large vocabulary size, making them difficult to be deployed on embedded system with limited hardware resources. This is because the size of word and output embedding matrices grow proportionally with the size of v…

    Submitted 11 June, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

    Comments: Added source code link and new results in Table 3

  10. arXiv:1702.00509  [pdf]

    cs.CV cs.LG

    Segmentation of optic disc, fovea and retinal vasculature using a single convolutional neural network

    Authors: Jen Hong Tan, U. Rajendra Acharya, Sulatha V. Bhandary, Kuang Chua Chua, Sobha Sivaprasad

    Abstract: We have developed and trained a convolutional neural network to automatically and simultaneously segment optic disc, fovea and blood vessels. Fundus images were normalised before segmentation was performed to enforce consistency in background lighting and contrast. For every effective point in the fundus image, our algorithm extracted three channels of input from the neighbourhood of the point and…

    Submitted 1 February, 2017; originally announced February 2017.

  11. arXiv:1402.6387  [pdf]

    cs.CV

    Active spline model: A shape based model-interactive segmentation

    Authors: Jen Hong Tan, U. Rajendra Acharya

    Abstract: Rarely in literature a method of segmentation cares for the edit after the algorithm delivers. They provide no solution when segmentation goes wrong. We propose to formulate point distribution model in terms of centripetal-parameterized Catmull-Rom spline. Such fusion brings interactivity to model-based segmentation, so that edit is better handled. When the delivered segment is unsatisfactory, use…

    Submitted 25 February, 2014; originally announced February 2014.

    Comments: submitted to Computers in Biology and Medicine, second revision