Few Shot Comic Character Re-identification
Published in Under Review, 2025
Authors
Mahdi Kanani, Ramin Toosi
Abstract
Recognizing stylized cartoon and comic characters across images is an important step for content retrieval, recommendation, and copyright monitoring. However, most existing benchmarks and methods are tailored to natural images or face crops and do not support few-shot, full-body re-identification. To bridge this gap, this paper presents a synthetic, identity-level benchmark and a specialized metric-learning framework for few-shot re-identification of cartoon characters. We create a dataset of 2,233 identities using a text-to-image model, yielding 12,431 images. Using this dataset, we train a variational encoder-decoder network whose mean latent vector, after L2 normalization, serves as the identity embedding. The network is trained with a hybrid loss that combines batch-hard triplet, reconstruction, and Kullback-Leibler divergence terms to promote discriminative feature separation and latent regularity. At test time, class prototypes are formed from small support sets, and query images are verified using cosine similarity and adaptive thresholding. Experiments demonstrate that the hybrid loss provides stable optimization and effective identity separation. The equal-error rate decreases as the number of reference images per identity increases and remains within the 0.5–1.2% range across few-shot configurations, outperforming current state-of-the-art methods.
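
The test-time procedure described in the abstract (prototypes built from small support sets, then cosine-similarity verification against a threshold) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of plain Python lists as embeddings, and the choice of an equal-error-rate operating point as the "adaptive" threshold are all hypothetical, since the abstract does not specify how the threshold is computed.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def build_prototype(support_embeddings: Sequence[Vector]) -> List[float]:
    """Average the L2-normalized support embeddings of one identity,
    then re-normalize the mean (assumed prototype definition)."""
    if not support_embeddings:
        raise ValueError("support set must contain at least one embedding")
    normalized = [l2_normalize(e) for e in support_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity, computed as the dot product of unit vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def verify(query: Vector, prototype: Vector, threshold: float) -> bool:
    """Accept the query as the prototype's identity if similarity >= threshold."""
    return cosine_similarity(query, prototype) >= threshold


def eer_threshold(genuine: Sequence[float],
                  impostor: Sequence[float]) -> Tuple[float, float]:
    """Pick the threshold where false-accept and false-reject rates are
    closest on held-out scores (one assumed way to set the threshold).
    Returns (threshold, approximate equal-error rate)."""
    best_gap, best_t, best_eer = None, 0.0, 1.0
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine pairs rejected
        far = sum(s >= t for s in impostor) / len(impostor)  # impostor pairs accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_t, best_eer = gap, t, (far + frr) / 2
    return best_t, best_eer


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for the encoder's mean latent vectors.
    prototype = build_prototype([[1.0, 0.0], [0.0, 1.0]])
    threshold, eer = eer_threshold([0.9, 0.8, 0.7], [0.1, 0.2, 0.75])
    print(verify([1.0, 1.0], prototype, threshold))
    print(verify([1.0, -1.0], prototype, threshold))
```

In this sketch, adding more support embeddings per identity only changes the input to `build_prototype`, which matches the few-shot setting in which the number of reference images per identity is varied.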
