Showing 1–17 of 17 results for author: Thai, L

  1. arXiv:2404.06395  [pdf, other]

    cs.CL cs.LG

    MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

    Authors: Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zheng Leng Thai, Kaihuo Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun

    Abstract: The burgeoning interest in developing Large Language Models (LLMs) with up to trillion parameters has been met with concerns regarding resource efficiency and practical expense, particularly given the immense cost of experimentation. This scenario underscores the importance of exploring the potential of Small Language Models (SLMs) as a resource-efficient alternative. In this context, we introduce…

    Submitted 3 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: revise according to peer review

  2. arXiv:2402.14008  [pdf, other]

    cs.CL

    OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems

    Authors: Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu, Maosong Sun

    Abstract: Recent advancements have seen Large Language Models (LLMs) and Large Multimodal Models (LMMs) surpassing general human capabilities in various tasks, approaching the proficiency level of human experts across multiple domains. With traditional benchmarks becoming less challenging for these models, new rigorous challenges are essential to gauge their advanced abilities. In this work, we present Olym…

    Submitted 6 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 (main), update

  3. arXiv:2402.13718  [pdf, other]

    cs.CL

    $\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens

    Authors: Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, Junhao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu, Maosong Sun

    Abstract: Processing and reasoning over long contexts is crucial for many practical applications of Large Language Models (LLMs), such as document comprehension and agent construction. Despite recent strides in making LLMs process contexts with more than 100K tokens, there is currently a lack of a standardized benchmark to evaluate this long-context capability. Existing public benchmarks typically focus on…

    Submitted 24 February, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Journal ref: 2023.12.15ARR

  4. arXiv:1902.02823  [pdf, other]

    cs.LG stat.ML

    Compatible Natural Gradient Policy Search

    Authors: Joni Pajarinen, Hong Linh Thai, Riad Akrour, Jan Peters, Gerhard Neumann

    Abstract: Trust-region methods have yielded state-of-the-art results in policy search. A common approach is to use KL-divergence to bound the region of trust resulting in a natural gradient policy update. We show that the natural gradient and trust region optimization are equivalent if we use the natural parameterization of a standard exponential policy distribution in combination with compatible value func…

    Submitted 7 February, 2019; originally announced February 2019.

  5. arXiv:1711.08973  [pdf, other]

    cs.DC

    A Survey and Taxonomy of Resource Optimisation for Executing Bag-of-Task Applications on Public Clouds

    Authors: Long Thai, Blesson Varghese, Adam Barker

    Abstract: Cloud computing has been widely adopted due to the flexibility in resource provisioning and on-demand pricing models. Entire clusters of Virtual Machines (VMs) can be dynamically provisioned to meet the computational demands of users. However, from a user's perspective, it is still challenging to utilise cloud resources efficiently. This is because an overwhelmingly wide variety of resource types…

    Submitted 24 November, 2017; originally announced November 2017.

    Comments: Accepted to Future Generation Computer Systems, 23 November 2017

  6. Cloud Benchmarking For Maximising Performance of Scientific Applications

    Authors: Blesson Varghese, Ozgur Akgun, Ian Miguel, Long Thai, Adam Barker

    Abstract: How can applications be deployed on the cloud to achieve maximum performance? This question is challenging to address with the availability of a wide variety of cloud Virtual Machines (VMs) with different performance capabilities. The research reported in this paper addresses the above question by proposing a six step benchmarking methodology in which a user provides a set of weights that indicate…

    Submitted 1 August, 2016; originally announced August 2016.

    Comments: 14 pages, accepted to the IEEE Transactions on Cloud Computing on 31 July 2016

  7. DocLite: A Docker-Based Lightweight Cloud Benchmarking Tool

    Authors: Blesson Varghese, Lawan Thamsuhang Subba, Long Thai, Adam Barker

    Abstract: Existing benchmarking methods are time consuming processes as they typically benchmark the entire Virtual Machine (VM) in order to generate accurate performance data, making them less suitable for real-time analytics. The research in this paper is aimed to surmount the above challenge by presenting DocLite - Docker Container-based Lightweight benchmarking tool. DocLite explores lightweight cloud b…

    Submitted 23 March, 2016; originally announced March 2016.

    Comments: 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2016, Cartagena, Colombia. arXiv admin note: substantial text overlap with arXiv:1601.03872

  8. Container-Based Cloud Virtual Machine Benchmarking

    Authors: Blesson Varghese, Lawan Thamsuhang Subba, Long Thai, Adam Barker

    Abstract: With the availability of a wide range of cloud Virtual Machines (VMs) it is difficult to determine which VMs can maximise the performance of an application. Benchmarking is commonly used to this end for capturing the performance of VMs. Most cloud benchmarking techniques are typically heavyweight - time consuming processes which have to benchmark the entire VM in order to obtain accurate benchmark…

    Submitted 15 January, 2016; originally announced January 2016.

    Comments: Accepted to the IEEE International Conference on Cloud Engineering (IEEE IC2E), Berlin, Germany, 2016 - 10 pages

  9. Task Scheduling on the Cloud with Hard Constraints

    Authors: Long Thai, Blesson Varghese, Adam Barker

    Abstract: Scheduling Bag-of-Tasks (BoT) applications on the cloud can be more challenging than grid and cluster environments. This is because a user may have a budgetary constraint or a deadline for executing the BoT application in order to keep the overall execution costs low. The research in this paper is motivated to investigate task scheduling on the cloud, given two hard constraints based on a user-d…

    Submitted 20 July, 2015; originally announced July 2015.

    Comments: Visionary Track of the IEEE 11th World Congress on Services (IEEE SERVICES 2015)

  10. Budget Constrained Execution of Multiple Bag-of-Tasks Applications on the Cloud

    Authors: Long Thai, Blesson Varghese, Adam Barker

    Abstract: Optimising the execution of Bag-of-Tasks (BoT) applications on the cloud is a hard problem due to the trade-offs between performance and monetary cost. The problem can be further complicated when multiple BoT applications need to be executed. In this paper, we propose and implement a heuristic algorithm that schedules tasks of multiple applications onto different cloud virtual machines in order t…

    Submitted 20 July, 2015; originally announced July 2015.

    Comments: 8th IEEE International Conference on Cloud Computing (CLOUD 2015)

  11. arXiv:1506.00590  [pdf, other]

    cs.DC

    Executing Bag of Distributed Tasks on Virtually Unlimited Cloud Resources

    Authors: Long Thai, Blesson Varghese, Adam Barker

    Abstract: Bag-of-Distributed-Tasks (BoDT) application is the collection of identical and independent tasks each of which requires a piece of input data located around the world. As a result, Cloud computing offers an effective way to execute BoT application as it not only consists of multiple geographically distributed data centres but also allows a user to pay for what she actually uses only. In this pap…

    Submitted 1 June, 2015; originally announced June 2015.

  12. Cloud Services Brokerage: A Survey and Research Roadmap

    Authors: Adam Barker, Blesson Varghese, Long Thai

    Abstract: A Cloud Services Brokerage (CSB) acts as an intermediary between cloud service providers (e.g., Amazon and Google) and cloud service end users, providing a number of value adding services. CSBs as a research topic are in their infancy. The goal of this paper is to provide a concise survey of existing CSB technologies in a variety of areas and highlight a roadmap, which details five future opportun…

    Submitted 1 June, 2015; originally announced June 2015.

    Comments: Paper published in the 8th IEEE International Conference on Cloud Computing (CLOUD 2015)

  13. Cloud Benchmarking for Performance

    Authors: Blesson Varghese, Ozgur Akgun, Ian Miguel, Long Thai, Adam Barker

    Abstract: How can applications be deployed on the cloud to achieve maximum performance? This question has become significant and challenging with the availability of a wide variety of Virtual Machines (VMs) with different performance capabilities in the cloud. The above question is addressed by proposing a six step benchmarking methodology in which a user provides a set of four weights that indicate how imp…

    Submitted 4 November, 2014; originally announced November 2014.

    Comments: 6 pages, 6th IEEE International Conference on Cloud Computing Technology and Science (IEEE CloudCom) 2014, Singapore

  14. arXiv:1410.8359  [pdf, other]

    cs.DC

    Optimal Deployment of Geographically Distributed Workflow Engines on the Cloud

    Authors: Long Thai, Adam Barker, Blesson Varghese, Ozgur Akgun, Ian Miguel

    Abstract: When orchestrating Web service workflows, the geographical placement of the orchestration engine(s) can greatly affect workflow performance. Data may have to be transferred across long geographical distances, which in turn increases execution time and degrades the overall performance of a workflow. In this paper, we present a framework that, given a DAG-based workflow specification, computes the o…

    Submitted 30 October, 2014; originally announced October 2014.

  15. arXiv:1410.8357  [pdf, other]

    cs.DC

    Executing Bag of Distributed Tasks on the Cloud: Investigating the Trade-offs Between Performance and Cost

    Authors: Long Thai, Blesson Varghese, Adam Barker

    Abstract: Bag of Distributed Tasks (BoDT) can benefit from decentralised execution on the Cloud. However, there is a trade-off between the performance that can be achieved by employing a large number of Cloud VMs for the tasks and the monetary constraints that are often placed by a user. The research reported in this paper is motivated towards investigating this trade-off so that an optimal plan for deployi…

    Submitted 30 October, 2014; originally announced October 2014.

  16. arXiv:1111.4052  [pdf]

    cs.CV

    A Facial Expression Classification System Integrating Canny, Principal Component Analysis and Artificial Neural Network

    Authors: Le Hoang Thai, Nguyen Do Thai Nguyen, Tran Son Hai

    Abstract: Facial Expression Classification is an interesting research problem in recent years. There are a lot of methods to solve this problem. In this research, we propose a novel approach using Canny, Principal Component Analysis (PCA) and Artificial Neural Network. Firstly, in preprocessing phase, we use Canny for local region detection of facial images. Then each of local region's features will be pres…

    Submitted 17 November, 2011; originally announced November 2011.

    Comments: 6 pages, 10 figures, International Journal of Machine Learning and Computing, Vol. 1, No. 4, October 2011, ISSN (Online): 2010-3700, http://www.ijmlc.org/

    Journal ref: International Journal of Machine Learning and Computing, Vol. 1, No. 4, 2011, 388-393

  17. arXiv:1107.3194  [pdf]

    cs.CV

    Fingerprint recognition using standardized fingerprint model

    Authors: Le Hoang Thai, Ha Nhat Tam

    Abstract: Fingerprint recognition is one of the most popular and accurate biometric technologies. Nowadays, it is used in many real applications. However, recognizing fingerprints in poor quality images is still a very complex problem. In recent years, many algorithms and models have been proposed to improve the accuracy of recognition systems. This paper discusses the standardized fingerprint model which is used to sy…

    Submitted 15 July, 2011; originally announced July 2011.

    Comments: 7 pages, 16 figures, 3 tables, IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 7, May 2010

    Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 7, May 2010, ISSN (Online): 1694-0784, ISSN (Print): 1694-0814