-
GenSERP: Large Language Models for Whole Page Presentation
Authors:
Zhenning Zhang,
Yunan Zhang,
Suyu Ge,
Guangwei Weng,
Mridu Narang,
Xia Song,
Saurabh Tiwary
Abstract:
The advent of large language models (LLMs) brings an opportunity to minimize the effort in search engine result page (SERP) organization. In this paper, we propose GenSERP, a framework that leverages LLMs with vision in a few-shot setting to dynamically organize intermediate search results, including generated chat answers, website snippets, multimedia data, and knowledge panels, into a coherent SERP layout based on a user's query. Our approach has three main stages: (1) An information gathering phase where the LLM continuously orchestrates API tools to retrieve different types of items, and proposes candidate layouts based on the retrieved items, until it's confident enough to generate the final result. (2) An answer generation phase where the LLM populates the layouts with the retrieved content. In this phase, the LLM adaptively optimizes the ranking of items and UX configurations of the SERP. Consequently, it assigns a location on the page to each item, along with the UX display details. (3) A scoring phase where an LLM with vision scores all the generated SERPs based on how likely each is to satisfy the user. It then sends the one with the highest score to rendering. GenSERP features two generation paradigms: (1) coarse-to-fine, which allows it to approach the optimal layout in a more manageable way, and (2) beam search, which gives it a better chance to hit the optimal solution compared to greedy decoding. Offline experimental results on real-world data demonstrate how LLMs can contextually organize heterogeneous search results on-the-fly and provide a promising user experience.
Submitted 16 April, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
The BCH Family of Storage Codes on Triangle-Free Graphs is of Unit Rate
Authors:
Haihua Deng,
Hexiang Huang,
Guobiao Weng,
Qing Xiang
Abstract:
Let $Γ$ be a simple connected graph on $n$ vertices, and let $C$ be a code of length $n$ whose coordinates are indexed by the vertices of $Γ$. We say that $C$ is a \textit{storage code} on $Γ$ if for any codeword $c \in C$, one can recover the information on each coordinate of $c$ by accessing its neighbors in $Γ$. The main problem here is to construct high-rate storage codes on triangle-free graphs. In this paper, we solve an open problem posed by Barg and Zémor in 2022, showing that the BCH family of storage codes is of unit rate. Furthermore, we generalize the construction of the BCH family and obtain more storage codes of unit rate on triangle-free graphs.
Submitted 7 October, 2023;
originally announced October 2023.
-
Real-time pedestrian recognition on low computational resources
Authors:
Guifan Weng
Abstract:
Pedestrian recognition has successfully been applied to security, autonomous cars, and aerial photography. For most applications, pedestrian recognition on small mobile devices is important. However, the limitations of the computing hardware make this a challenging task. In this work, we investigate real-time pedestrian recognition on small physical-size computers with low computational resources for faster speed. This paper presents three methods that work on small physical-size CPU systems. First, we improved the Local Binary Pattern (LBP) features and AdaBoost classifier. Second, we optimized the Histogram of Oriented Gradients (HOG) and Support Vector Machine. Third, we implemented fast Convolutional Neural Networks (CNNs). The results demonstrate that the three methods achieved real-time pedestrian recognition at an accuracy of more than 95% and a speed of more than 5 fps on a small physical-size computational platform with a 1.8 GHz Intel i5 CPU. Our methods can be easily applied to small mobile devices with high compatibility and generality.
Submitted 4 September, 2023;
originally announced September 2023.
-
Action Recognition based on Cross-Situational Action-object Statistics
Authors:
Satoshi Tsutsui,
Xizi Wang,
Guangyuan Weng,
Yayun Zhang,
David Crandall,
Chen Yu
Abstract:
Machine learning models of visual action recognition are typically trained and tested on data from specific situations where actions are associated with certain objects. It is an open question how action-object associations in the training set influence a model's ability to generalize beyond trained situations. We set out to identify properties of training data that lead to action recognition models with greater generalization ability. To do this, we take inspiration from a cognitive mechanism called cross-situational learning, which states that human learners extract the meaning of concepts by observing instances of the same concept across different situations. We perform controlled experiments with various types of action-object associations, and identify key properties of action-object co-occurrence in training data that lead to better classifiers. Given that these properties are missing in the datasets that are typically used to train action classifiers in the computer vision literature, our work provides useful insights on how we should best construct datasets for efficiently training for better generalization.
Submitted 15 August, 2022;
originally announced August 2022.
-
Advanced Mapping Robot and High-Resolution Dataset
Authors:
Hongyu Chen,
Zhijie Yang,
Xiting Zhao,
Guangyuan Weng,
Haochuan Wan,
Jianwen Luo,
Xiaoya Ye,
Zehao Zhao,
Zhenpeng He,
Yongxia Shen,
Sören Schwertfeger
Abstract:
This paper presents a fully hardware synchronized mapping robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. Nine high-resolution cameras and two 32-beam 3D Lidars were used along with a professional, static 3D scanner for ground truth map collection. With all the sensors calibrated on the mapping robot, three datasets are collected to evaluate the performance of mapping algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. We provide the datasets for download and the mapping and evaluation procedures are made in a very easily reproducible manner for maximum comparability. We have also conducted a survey on available robotics-related datasets and compiled a big table with those datasets and a number of properties of them.
Submitted 23 July, 2020;
originally announced July 2020.
-
Towards Generation and Evaluation of Comprehensive Mapping Robot Datasets
Authors:
Hongyu Chen,
Xiting Zhao,
Jianwen Luo,
Zhijie Yang,
Zehao Zhao,
Haochuan Wan,
Xiaoya Ye,
Guangyuan Weng,
Zhenpeng He,
Tian Dong,
Sören Schwertfeger
Abstract:
This paper presents a fully hardware synchronized mapping robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. We also employ a professional, static 3D scanner for ground truth map collection. Three datasets are generated to evaluate the performance of mapping algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. The mapping and evaluation procedures are made in a very easily reproducible manner for maximum comparability. In the end we can draw a couple of conclusions about the tested SLAM algorithms.
Submitted 24 August, 2019; v1 submitted 23 May, 2019;
originally announced May 2019.