-
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Authors:
Hui Lu,
Hu Jian,
Ronald Poppe,
Albert Ali Salah
Abstract:
Owing to their ability to extract relevant spatio-temporal video embeddings, Vision Transformers (ViTs) are currently the best performing models in video action understanding. However, their generalization over domains or datasets is somewhat limited. In contrast, Visual Language Models (VLMs) have demonstrated exceptional generalization performance, but are currently unable to process videos. Consequently, they cannot extract spatio-temporal patterns that are crucial for action understanding. In this paper, we propose the Four-tiered Prompts (FTP) framework that takes advantage of the complementary strengths of ViTs and VLMs. We retain ViTs' strong spatio-temporal representation ability but improve the visual encodings to be more comprehensive and general by aligning them with VLM outputs. The FTP framework adds four feature processors that focus on specific aspects of human action in videos: action category, action components, action description, and context information. The VLMs are only employed during training, and inference incurs a minimal computation cost. Our approach consistently yields state-of-the-art performance. For instance, we achieve remarkable top-1 accuracy of 93.8% on Kinetics-400 and 83.4% on Something-Something V2, surpassing VideoMAEv2 by 2.8% and 2.6%, respectively.
Submitted 24 March, 2024;
originally announced March 2024.
-
Apache Submarine: A Unified Machine Learning Platform Made Simple
Authors:
Kai-Hsun Chen,
Huan-** Su,
Wei-Chiu Chuang,
Hung-Chang Hsiao,
Wangda Tan,
Zhankun Tang,
Xun Liu,
Yanbo Liang,
Wen-Chih Lo,
Wanqiang Ji,
Byron Hsu,
Keqiu Hu,
HuiYang Jian,
Quan Zhou,
Chien-Min Wang
Abstract:
As machine learning is applied more widely, it is necessary to have a machine learning platform for both infrastructure administrators and users including expert data scientists and citizen data scientists to improve their productivity. However, existing machine learning platforms are ill-equipped to address the "Machine Learning tech debts" such as glue code, reproducibility, and portability. Furthermore, existing platforms only take expert data scientists into consideration, and thus they are inflexible for infrastructure administrators and non-user-friendly for citizen data scientists. We propose Submarine, a unified machine learning platform, to address the challenges.
Submitted 21 August, 2021;
originally announced August 2021.
-
In Search of a Key Value Store with High Performance and High Availability
Authors:
Huaibing Jian,
Yuean Zhu,
Yongchao Long,
Bin Li,
Shu Wang,
Xiliang Wu,
Zhichu Zhong
Abstract:
In recent years, write-heavy applications have become increasingly prevalent. Handling this kind of workload efficiently is an active research direction in the field of database systems. The overhead of write operations has two main causes: 1) at the hardware level, the I/O cost of logging, which cannot be removed in the short term; and 2) the dual-copy software architecture and serial replay. The log-as-database architecture was introduced to overcome this software defect. However, existing systems that treat the log as the database are either built on special infrastructure such as InfiniBand or NVRAM (non-volatile random-access memory), which is far from widely available, or constructed with the help of another system such as Dynamo, which lacks flexibility. In this paper we build a write-once key-value system called LogStore from scratch to support our instant messenger business. The key features of LogStore include: 1) a single-thread-per-partition execution mode, which eliminates concurrency overhead; 2) log as database, to enable the write-once feature and freshness on the standby. We achieve high availability by embedding the replication protocol rather than depending on other infrastructure; 3) fine-grained, low-overhead data buffer pool management to effectively minimize I/O cost. According to our empirical evaluations, LogStore performs well in write operations, recovery, and replication.
Submitted 17 April, 2019;
originally announced April 2019.
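The log-as-database and per-partition ideas named in the LogStore abstract above can be illustrated with a toy sketch. This is not the authors' implementation: the class names (`Partition`, `ToyLogStore`), the CRC32 key routing, and the in-memory list standing in for an on-disk log are all assumptions made for illustration. Each value is written exactly once, to the log, and a standby is brought up to date by replaying that log.

```python
import zlib
from typing import Dict, List, Optional, Tuple

Record = Tuple[str, bytes]


class Partition:
    """Hypothetical partition: an append-only log plus an index into it.

    In a real system each partition would be driven by exactly one thread,
    so no locking is needed; here that is only implied by never sharing a
    partition's state between callers.
    """

    def __init__(self) -> None:
        self.log: List[Record] = []     # the log is the only copy of the data
        self.index: Dict[str, int] = {}  # key -> offset of its latest record

    def put(self, key: str, value: bytes) -> int:
        # Write once: append to the log, then point the index at the record.
        offset = len(self.log)
        self.log.append((key, value))
        self.index[key] = offset
        return offset

    def get(self, key: str) -> Optional[bytes]:
        offset = self.index.get(key)
        return None if offset is None else self.log[offset][1]


class ToyLogStore:
    """Hypothetical store that routes each key to a fixed partition."""

    def __init__(self, num_partitions: int = 4) -> None:
        self.partitions = [Partition() for _ in range(num_partitions)]

    def _partition_for(self, key: str) -> Partition:
        # CRC32 gives a stable hash across processes (an assumed choice).
        slot = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        return self.partitions[slot]

    def put(self, key: str, value: bytes) -> int:
        return self._partition_for(key).put(key, value)

    def get(self, key: str) -> Optional[bytes]:
        return self._partition_for(key).get(key)

    def build_standby(self) -> "ToyLogStore":
        # Replication by log shipping: the standby replays each partition's
        # log in order and rebuilds its own index from it.
        standby = ToyLogStore(len(self.partitions))
        for source, target in zip(self.partitions, standby.partitions):
            for key, value in source.log:
                target.put(key, value)
        return standby
```

Because the index only stores offsets into the log, an overwrite is just another append, and the standby never needs a second copy of the data beyond the replayed log.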
-
AME Blockchain: An Architecture Design for Closed-Loop Fluid Economy Token System
Authors:
Lanny Z. N. Yuan,
Huaibing Jian,
Peng Liu,
Pengxin Zhu,
ShanYang Fu
Abstract:
In this white paper, we propose a blockchain-based system, named AME, which is a decentralized infrastructure and application platform with enhanced security and self-management properties. The AME blockchain technology aims to increase the transaction throughput by adopting various optimizations in network transport and storage layers, and to enhance smart contracts with AI algorithm support. We introduce all major technologies adopted in our system, including blockchain, distributed storage, P2P network, service application framework, and data encryption. To properly provide a cohesive, concise, yet comprehensive introduction to the AME system, we mainly focus on describing the unique definitions and features that guide the system implementation.
Submitted 18 December, 2018;
originally announced December 2018.