
Online Cursive Handwriting Generation Using Trace Transformation and Symbol-Independent Point Classification Model

By Karina Korovai, Samsung R&D Institute Ukraine

By Olga Radyvonenko, Samsung R&D Institute Ukraine

By Nataliya Sakhnenko, Samsung R&D Institute Ukraine

By Oleg Yakovchuk, Samsung R&D Institute Ukraine

By Andrii Kroitor, Samsung R&D Institute Ukraine

By Dmytro Zhelezniakov, Samsung R&D Institute Ukraine

1. Introduction

Handwriting generation has long been an area of active research, driven by applications in digital documents, personalized fonts, and assistive technologies. Both online (stroke-by-stroke) and offline (image-based) handwriting generation have been extensively studied [1], and both have evolved toward more sophisticated architectures with a strong emphasis on disentangling style and content.

Offline methods using GANs, diffusion models, and visual transformers produce high-quality images but suffer from unrealistic spacing, background noise, inconsistent ink, and loss of subtle details. Online handwriting generation evolved from early RNN-based approaches [2] into more complex architectures [3-5]. A notable advancement was the development of models trained to disentangle style representations at both writer and character levels [6-7]. These solutions, however, focus on Chinese writing and do not address cursive Latin and Cyrillic scripts, where ligatures (connecting elements between letters) are essential [8]. Meanwhile, existing ligature generation methods [9-10] rely on manually designed heuristics and geometric assumptions, which require extensive tuning and fail to adapt to diverse writing styles.

As a result, current approaches cannot fully address the challenge of generating high-quality cursive handwriting that preserves individual writing styles and natural letter connections, while maintaining computational efficiency for real-time editing on mobile devices.

We propose a novel approach for generating cursive handwritten text as digital ink traces, complementing existing single-character generation methods [6-7]. Our method learns structural segmentation directly from data using a lightweight RNN classifier and applies trace transformation to seamlessly connect symbols while preserving handwriting style. This enables unified cross-language support and real-time mobile performance.

The evaluation results demonstrate the effectiveness of our approach in cursive handwriting text generation and replicating nuanced writing styles while enabling real-time responsiveness. Although tested on Latin-based languages, the method is adaptable to other scripts with connected writing.

2. Our approach

We propose a method for handwriting ligature synthesis in two steps: structural segmentation for each symbol and stroke transformation to generate ligatures. The approach pipeline is illustrated in Fig. 1.

Figure 1. The complete approach pipeline.

2.1 Head/Tail Detection

We generalize the structural segmentation step by learning it directly from data using a supervised model rather than relying on fixed rules or templates.

All characters undergo a preprocessing step that includes linear resampling (20 points), spatial normalization, and feature calculation (coordinates and a binary pen state flag). Notably, the model does not receive character labels (text codes) as input and operates solely on these point features, making it character-independent. A lightweight RNN-based pointwise classifier (about 12k parameters) with two stacked BiGRU layers predicts the class (head / body / tail / isolated) of each point based on position and context, enabling adaptation to varying handwriting styles beyond rule-based approaches.
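The preprocessing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `preprocess`, the arc-length resampling scheme, the bounding-box normalization, and the convention that the pen flag is 1 on drawn segments and 0 on pen-up jumps between strokes are all assumptions, since the text only states that 20 resampled points, normalized coordinates, and a binary pen state flag are used.

```python
import math

def preprocess(strokes, n_points=20):
    """Resample a symbol to n_points rows of (x, y, pen_down).

    strokes: list of strokes, each a list of (x, y) points.
    Assumed conventions (not from the source): points are spaced evenly
    along the polyline through all strokes, pen_down is 0 on the jump
    between two strokes, and coordinates are scaled by the larger side
    of the bounding box.
    """
    pts, stroke_id = [], []
    for sid, stroke in enumerate(strokes):
        for x, y in stroke:
            pts.append((float(x), float(y)))
            stroke_id.append(sid)
    if len(pts) < 2:
        raise ValueError("need at least two points")

    # Cumulative arc length along the concatenated polyline.
    cum = [0.0]
    for a, b in zip(pts, pts[1:]):
        cum.append(cum[-1] + math.hypot(b[0] - a[0], b[1] - a[1]))
    total = cum[-1]
    if total == 0.0:
        raise ValueError("degenerate trace with zero length")

    # Linear interpolation at n_points equally spaced arc-length positions.
    resampled, seg = [], 0
    for k in range(n_points):
        s = total * k / (n_points - 1)
        while seg < len(pts) - 2 and cum[seg + 1] < s:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = min(1.0, (s - cum[seg]) / span) if span > 0 else 0.0
        x = pts[seg][0] + t * (pts[seg + 1][0] - pts[seg][0])
        y = pts[seg][1] + t * (pts[seg + 1][1] - pts[seg][1])
        pen_down = 1 if stroke_id[seg] == stroke_id[seg + 1] else 0
        resampled.append((x, y, pen_down))

    # Spatial normalization: shift to the bounding-box corner, scale to unit size.
    xs = [p[0] for p in resampled]
    ys = [p[1] for p in resampled]
    min_x, min_y = min(xs), min(ys)
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / scale, (y - min_y) / scale, pen)
            for x, y, pen in resampled]
```

Because the output contains only coordinates and the pen flag, the same feature rows can be fed to a pointwise classifier regardless of which symbol was written.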

The resulting structural labels then guide targeted stroke transformation for coherent handwriting modification.
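To illustrate how per-point labels could guide the transformation, the sketch below converts a predicted label sequence into index ranges for the deformable head and tail sections. The helper name `deformable_ranges`, the string label values, and the assumption that the head is a contiguous run at the start of the trace and the tail a contiguous run at its end are hypothetical; the source does not specify how the labels are post-processed.

```python
def deformable_ranges(labels):
    """Return half-open index ranges for the head and tail of one symbol.

    labels: per-point classes, each one of "head", "body", "tail", "isolated".
    Assumption (not from the source): head points form a run at the start of
    the trace and tail points form a run at the end; everything else,
    including "isolated" points, is left untouched by the deformation.
    """
    n = len(labels)
    head_end = 0
    while head_end < n and labels[head_end] == "head":
        head_end += 1
    tail_start = n
    while tail_start > head_end and labels[tail_start - 1] == "tail":
        tail_start -= 1
    return {
        "head": (0, head_end) if head_end > 0 else None,
        "tail": (tail_start, n) if tail_start < n else None,
    }


# Usage: a 10-point symbol with a 3-point head and a 2-point tail.
example = ["head"] * 3 + ["body"] * 5 + ["tail"] * 2
ranges = deformable_ranges(example)  # {"head": (0, 3), "tail": (8, 10)}
```

A symbol whose points are all labelled body or isolated yields `None` for both ranges, so no ligature would be attached to it under this convention.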

2.2 Trace Deformation

We propose a trace deformation algorithm based on point-wise optimization that adjusts the flexible sections of handwritten characters for smooth ligature generation. The boundary points of the deformable segments remain unchanged, while the optimization balances three objectives:

Connection distance minimizes the gap between the endpoints of the transformed tail and head segments, ensuring seamless connections.

Local smoothness distance ensures smooth local transitions without unnatural bends by penalizing large deviations between consecutive points.

Displacement distance prevents excessive deviation from the original character strokes by penalizing the overall shift of each point from its original position.

The total objective is a weighted sum of these three components, with empirically selected weight coefficients to guarantee continuity, stability, and smoothness. Setting the partial derivatives of the objective to zero reduces the convex optimization problem to a tridiagonal system of linear equations solvable in O(N) operations, enabling real-time performance on resource-limited devices.
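The reduction to a tridiagonal system can be illustrated with a small sketch. The exact energy terms and weights are not given here, so the formulation below is an assumption: the tail of the left symbol and the head of the right symbol are treated as one chain, the unknowns are per-point displacements d_i, and the energy is w_disp * sum(d_i^2) + w_smooth * sum((d_{i+1} - d_i)^2) + w_conn * (gap after displacement)^2, with the two outer boundary points fixed (d = 0). The function names and default weights are hypothetical. Each free point then couples only to its two neighbours, which gives one tridiagonal system per coordinate axis, solved here with the Thomas algorithm.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm: solve a tridiagonal system in O(n).

    lower[i] multiplies x[i-1] (lower[0] is unused) and
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    """
    n = len(diag)
    c, r = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    r[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        r[i] = (rhs[i] - lower[i] * r[i - 1]) / denom
    x = [0.0] * n
    x[-1] = r[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = r[i] - c[i] * x[i + 1]
    return x


def connect_segments(tail, head, w_conn=10.0, w_smooth=1.0, w_disp=0.1):
    """Deform a tail and the following head so that their endpoints meet.

    tail[0] and head[-1] are the fixed boundary points; all other points
    may move. Weights are illustrative, not the authors' values.
    """
    orig = [tuple(map(float, p)) for p in list(tail) + list(head)]
    n = len(orig)
    if len(tail) < 2 or len(head) < 2:
        raise ValueError("tail and head need at least two points each")
    j = len(tail) - 1                # edge (j, j+1) is the tail-to-head gap
    k = [w_smooth] * (n - 1)         # per-edge weights
    k[j] = w_conn
    result = [list(p) for p in orig]
    for axis in (0, 1):              # x and y decouple
        t = [0.0] * (n - 1)          # target value of d[i+1] - d[i] per edge
        t[j] = orig[j][axis] - orig[j + 1][axis]   # closes the gap exactly
        lower, diag, upper, rhs = [], [], [], []
        for i in range(1, n - 1):    # free points only
            lower.append(-k[i - 1] if i > 1 else 0.0)
            upper.append(-k[i] if i < n - 2 else 0.0)
            diag.append(w_disp + k[i - 1] + k[i])
            rhs.append(k[i - 1] * t[i - 1] - k[i] * t[i])
        d = solve_tridiagonal(lower, diag, upper, rhs)
        for i in range(1, n - 1):
            result[i][axis] += d[i - 1]
    points = [tuple(p) for p in result]
    return points[:j + 1], points[j + 1:]
```

Under these assumptions, a larger `w_conn` pulls the two endpoints closer together, while `w_disp` and `w_smooth` limit how far and how abruptly the strokes move away from their original shape.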

3. Experiments and Results

In our study, the primary objective is to assess ligature synthesis with respect to text readability, visual appeal, and consistency with the user’s original handwriting. We evaluate our approach through quantitative analysis, qualitative assessment, and efficiency measurements on mobile devices. The experiments focus on end-to-end evaluation and are limited to English.

3.1 Quantitative analysis

We assess readability, visual appeal, and style consistency using standard metrics (Table 1). Lower Handwriting Distance (HWD) [3], Fréchet Inception Distance (FID) [4], and Kernel Inception Distance (KID) [5] values for connected text confirm improved similarity to user handwriting.

Table 1. Quantitative similarity metrics of generated handwriting with and without connections relative to user-written samples.

Readability was measured via character recognition rate (CRR) and word recognition rate (WRR) using multiple text recognition systems (Table 2). Generated text achieved higher recognition rates than originals, with only ~1% decrease after adding cursivity, demonstrating its limited impact on readability. Across all tools, readability differences between user-written and generated text remained under 5%, demonstrating style-consistent text generation.
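For readers unfamiliar with these metrics, the sketch below shows one common way to compute them. The exact definitions used in the evaluation are not stated here, so this is an assumption: both rates are taken as one minus the edit distance between the recognizer output and the reference, normalized by the reference length, at character level for CRR and at word level for WRR. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def recognition_rate(ref, hyp):
    """1 - normalized edit distance, clipped at 0 (assumed definition)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def crr(reference, recognized):
    """Character recognition rate over the raw strings."""
    return recognition_rate(reference, recognized)


def wrr(reference, recognized):
    """Word recognition rate over whitespace-separated tokens."""
    return recognition_rate(reference.split(), recognized.split())


# Usage: one wrong character out of 11, one wrong word out of 2.
print(crr("hello world", "hallo world"))
print(wrr("hello world", "hallo world"))
```

With this definition a single misrecognized character costs a whole word in WRR, which is why WRR is typically the lower and more sensitive of the two rates.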

Table 2. Readability comparison for user-written and generated text.

3.2 Qualitative evaluation

A user study with 18 participants evaluated 55 image pairs (original/generated) on readability and visual appeal (Fig. 2). For readability, 87% of generated images were non-inferior, with 36% rated better. For visual appeal, 71% of generated images matched or exceeded the originals. Remaining issues stemmed primarily from symbol synthesis artifacts rather than ligature generation.

Figure 2. Qualitative evaluation survey: choice distribution.

3.3 Efficiency

The performance evaluation was conducted on a Samsung Galaxy S25 (CPU-only, single-thread). We assessed the individual stages of the handwriting generation process: symbol synthesis using two different methods, followed by the head/tail detection and trace deformation steps (Table 3). The results show that ligature generation with the proposed approach runs efficiently on-device.

Table 3. Time used for different stages of a single text symbol generation.

4. Conclusions

The results confirm that our approach effectively generates natural cursive handwriting while preserving letter shapes. Generated text remains minimally detectable to both human reviewers and automated tools, with WRR improvements of up to 3.21%. Operating directly on raw points ensures computational efficiency suitable for real-time use on low-end mobile platforms.

The method produces generated text that can imperceptibly replace or extend the user’s handwriting, enabling applications such as on-the-fly error correction, auto-completion, and personalized content generation, with letter-level corrections possible without regenerating entire words.

Future work will focus on enhancing adaptability to diverse handwriting styles and extending support to other scripts, including right-to-left writing systems.

References

1. Diaz, Moises, et al. "A survey of handwriting synthesis from 2019 to 2024: A comprehensive review." Pattern Recognition 162 (2025): 111357.
2. Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013).
3. Aksan, Emre, Fabrizio Pece, and Otmar Hilliges. "Deepwriting: Making digital ink editable via deep generative modeling." Proceedings of the 2018 CHI conference on human factors in computing systems. 2018.
4. Tang, Shusen, and Zhouhui Lian. "Write Like You: Synthesizing Your Cursive Online Chinese Handwriting via Metric‐based Meta Learning." Computer Graphics Forum. Vol. 40. No. 2. 2021.
5. Chang, Jen-Hao Rick, et al. "Style equalization: Unsupervised learning of controllable generative sequence models." International Conference on Machine Learning. PMLR, 2022.
6. Dai, Gang, et al. "Disentangling writer and character styles for handwriting generation." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2023.
7. Liu, Yu, et al. "Elegantly written: Disentangling writer and character styles for enhancing online Chinese handwriting." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024.
8. Korovai, Karina, et al. "Handwriting enhancement: recognition-based and recognition-independent approaches for on-device online handwritten text alignment." IEEE Access 12 (2024): 99334-99348.
9. Wang, Jue, et al. "Combining shape and physical models for online cursive handwriting synthesis." International Journal of Document Analysis and Recognition (IJDAR) 7.4 (2005): 219-227.
10. Lin, Zhouchen, and Liang Wan. "Style-preserving english handwriting synthesis." Pattern Recognition 40.7 (2007): 2097-2109.

#ICASSP #DeepLearning
