-
FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications
Authors:
Yuki Tatsukawa,
I-Chao Shen,
Anran Qi,
Yuki Koyama,
Takeo Igarashi,
Ariel Shamir
Abstract:
Acquiring the desired font for various design tasks can be challenging and requires professional typographic knowledge. While previous font retrieval or generation works have alleviated some of these difficulties, they often lack support for multiple languages and semantic attributes beyond the training data domains. To solve this problem, we present FontCLIP: a model that connects the semantic understanding of a large vision-language model with typographical knowledge. We integrate typography-specific knowledge into the comprehensive vision-language knowledge of a pretrained CLIP model through a novel finetuning approach. We propose to use a compound descriptive prompt that encapsulates adaptively sampled attributes from a font attribute dataset focusing on Roman alphabet characters. FontCLIP's semantic typographic latent space demonstrates two unprecedented generalization abilities. First, FontCLIP generalizes to different languages including Chinese, Japanese, and Korean (CJK), capturing the typographical features of fonts across different languages, even though it was only finetuned using fonts of Roman characters. Second, FontCLIP can recognize semantic attributes that are not present in the training data. FontCLIP's dual-modality and generalization abilities enable multilingual and cross-lingual font retrieval and letter shape optimization, reducing the burden of obtaining desired fonts.
Submitted 11 March, 2024;
originally announced March 2024.
-
PerfectTailor: Scale-Preserving 2D Pattern Adjustment Driven by 3D Garment Editing
Authors:
Anran Qi,
Takeo Igarashi
Abstract:
We address the problem of modifying a given well-designed 2D sewing pattern to accommodate garment edits in the 3D space. Existing methods usually adjust the sewing pattern by applying uniform flattening to the 3D garment. The problems are twofold: first, it ignores local scaling of the 2D sewing pattern such as shrinking ribs of cuffs; second, it does not respect the implicit design rules and conventions of the industry, such as the use of straight edges for simplicity and precision in sewing. To address those problems, we present a pattern adjustment method that considers the non-uniform local scaling of the 2D sewing pattern by utilizing the intrinsic scale matrix. In addition, we preserve the original boundary shape by an as-similar-as-possible geometric constraint when desirable. We build a prototype with a set of commonly used alteration operations and showcase the capability of our method via a number of alteration examples throughout the paper.
Submitted 16 December, 2023; v1 submitted 12 December, 2023;
originally announced December 2023.
-
PersonalTailor: Personalizing 2D Pattern Design from 3D Garment Point Clouds
Authors:
Sauradip Nag,
Anran Qi,
Xiatian Zhu,
Ariel Shamir
Abstract:
Garment pattern design aims to convert a 3D garment to the corresponding 2D panels and their sewing structure. Existing methods rely either on template fitting with heuristics and prior assumptions, or on model learning with complicated shape parameterization. Importantly, neither approach allows for personalization of the output garment, which is increasingly in demand. To meet this demand, we introduce PersonalTailor: a personalized 2D pattern design method, where the user can input specific constraints or demands (in language or sketch) for personal 2D panel fabrication from 3D point clouds. PersonalTailor first learns multi-modal panel embeddings based on unsupervised cross-modal association and attentive fusion. It then predicts binary panel masks individually using a transformer encoder-decoder framework. Extensive experiments show that our PersonalTailor excels on both personalized and standard pattern fabrication tasks.
Submitted 11 August, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
How Far Can I Go ? : A Self-Supervised Approach for Deterministic Video Depth Forecasting
Authors:
Sauradip Nag,
Nisarg Shah,
Anran Qi,
Raghavendra Ramachandra
Abstract:
In this paper we present a novel self-supervised method to anticipate the depth estimate for a future, unobserved real-world urban scene. This work is the first to explore self-supervised learning for estimation of monocular depth of future unobserved frames of a video. Existing works rely on a large number of annotated samples to generate the probabilistic prediction of depth for unseen frames. However, this is unrealistic because it requires a large amount of annotated depth samples of video. In addition, the probabilistic nature of the case, where one past can have multiple future outcomes, often leads to incorrect depth estimates. Unlike previous methods, we model the depth estimation of the unobserved frame as a view-synthesis problem, which treats the depth estimate of the unseen video frame as an auxiliary task while synthesizing back the views using learned pose. This approach is not only cost effective (we do not use any ground truth depth for training, hence practical) but also deterministic (a sequence of past frames maps to an immediate future). To address this task we first develop a novel depth forecasting network, DeFNet, which estimates the depth of the unobserved future by forecasting latent features. Second, we develop a channel-attention based pose estimation network that estimates the pose of the unobserved frame. Using this learned pose, the estimated depth map is reconstructed back into the image domain, thus forming a self-supervised solution. Our proposed approach shows significant improvements in the Abs Rel metric compared to state-of-the-art alternatives on both short and mid-term forecasting settings, benchmarked on KITTI and Cityscapes. Code is available at https://github.com/sauradip/depthForecasting
Submitted 8 July, 2022; v1 submitted 1 July, 2022;
originally announced July 2022.
-
One Sketch for All: One-Shot Personalized Sketch Segmentation
Authors:
Anran Qi,
Yulia Gryaditskaya,
Tao Xiang,
Yi-Zhe Song
Abstract:
We present the first one-shot personalized sketch segmentation method. We aim to segment all sketches belonging to the same category provisioned with a single sketch with a given part annotation while (i) preserving the parts semantics embedded in the exemplar, and (ii) being robust to input style and abstraction. We refer to this scenario as personalized. With that, we importantly enable a much-desired personalization capability for downstream fine-grained sketch analysis tasks. To train a robust segmentation module, we deform the exemplar sketch to each of the available sketches of the same category. Our method generalizes to sketches not observed during training. Our central contribution is a sketch-specific hierarchical deformation network. Given a multi-level sketch-strokes encoding obtained via a graph convolutional network, our method estimates rigid-body transformation from the target to the exemplar, on the upper level. Finer deformation from the exemplar to the globally warped target sketch is further obtained through stroke-wise deformations, on the lower level. Both levels of deformation are guided by mean squared distances between the keypoints learned without supervision, ensuring that the stroke semantics are preserved. We evaluate our method against the state-of-the-art segmentation and perceptual grouping baselines re-purposed for the one-shot setting and against two few-shot 3D shape segmentation methods. We show that our method outperforms all the alternatives by more than $10\%$ on average. Ablation studies further demonstrate that our method is robust to personalization: changes in input part semantics and style differences.
Submitted 24 March, 2022; v1 submitted 20 December, 2021;
originally announced December 2021.
-
Graph Fourier Transform Based on $\ell_1$ Norm Variation Minimization
Authors:
Lihua Yang,
Anna Qi,
Chao Huang,
Jianfeng Huang
Abstract:
The definition of the graph Fourier transform is a fundamental issue in graph signal processing. The conventional graph Fourier transform is defined through the eigenvectors of the graph Laplacian matrix, which minimize the $\ell_2$ norm signal variation. However, the computation of Laplacian eigenvectors is expensive when the graph is large. In this paper, we propose an alternative definition of the graph Fourier transform based on $\ell_1$ norm variation minimization. We obtain a necessary condition satisfied by the $\ell_1$ Fourier basis, and provide a fast greedy algorithm to approximate the $\ell_1$ Fourier basis. Numerical experiments show the effectiveness of the greedy algorithm. Moreover, the Fourier transform under the greedy basis demonstrates a similar rate of decay to that of the Laplacian basis for simulated or real signals.
Submitted 19 August, 2019;
originally announced August 2019.
-
Fast Authentication and Progressive Authorization in Large-Scale IoT: How to Leverage AI for Security Enhancement?
Authors:
He Fang,
Angie Qi,
Xianbin Wang
Abstract:
Security provisioning has become the most important design consideration for large-scale Internet of Things (IoT) systems due to their critical role in supporting diverse vertical applications by connecting heterogeneous devices, machines and industry processes. Conventional authentication and authorization schemes are insufficient in dealing with the emerging IoT security challenges due to their reliance on both static digital mechanisms and computational complexity for improving security level. Furthermore, the isolated security designs for different layers and link segments, which ignore the overall protection, lead to cascaded security risks as well as growing communication latency and overhead. In this article, we envision new artificial intelligence (AI) enabled security provisioning approaches to overcome these issues while achieving fast authentication and progressive authorization. To be more specific, a lightweight intelligent authentication approach is developed by exploring machine learning at the gateway to identify the access time slots or frequencies of resource-constrained devices. Then we propose a holistic authentication and authorization approach, where online machine learning and trust management are adopted for analyzing the complex dynamic environment and achieving adaptive access control. These new AI-enabled approaches establish the connections between transceivers quickly and enhance security progressively, so that communication latency can be reduced and security risks are well-controlled in large-scale IoT. Finally, we outline several areas for AI-enabled security provisioning for future research.
Submitted 28 July, 2019;
originally announced July 2019.
-
Bayesian Nonexhaustive Learning for Online Discovery and Modeling of Emerging Classes
Authors:
Murat Dundar,
Ferit Akova,
Alan Qi,
Bartek Rajwa
Abstract:
We present a framework for online inference in the presence of a nonexhaustively defined set of classes that incorporates supervised classification with class discovery and modeling. A Dirichlet process prior (DPP) model defined over class distributions ensures that both known and unknown class distributions originate according to a common base distribution. In an attempt to automatically discover potentially interesting class formations, the prior model is coupled with a suitably chosen data model, and sequential Monte Carlo sampling is used to perform online inference. Our research is driven by a biodetection application, where a new class of pathogen may suddenly appear, and the rapid increase in the number of samples originating from this class indicates the onset of an outbreak.
Submitted 18 June, 2012;
originally announced June 2012.