Computer Science and Engineering MS Thesis Defense by Kurmanbek Kaiyrbekov

August 29, 2018

KOÇ UNIVERSITY

GRADUATE SCHOOL OF SCIENCES & ENGINEERING

COMPUTER SCIENCE AND ENGINEERING

MS THESIS DEFENSE BY KURMANBEK KAIYRBEKOV

 

Title: Stroke-based Sketched Symbol Generation and Segmentation

 

Speaker: Kurmanbek Kaiyrbekov

 

Time: August 8, 2018, 11:00 AM

 

Place: ENG127

Koç University

Rumeli Feneri Yolu

Sariyer, Istanbul

 

Thesis Committee Members:

Assoc. Prof. T. Metin Sezgin (Advisor, Koç University)

Assoc. Prof. Engin Erzin (Koç University)

Asst. Prof. Hamdi Dibeklioğlu (Bilkent University)

 

Abstract:

Hand-drawn objects usually consist of multiple semantically meaningful parts. For example, a stick figure consists of a head, a torso, and pairs of legs and arms. The process of breaking a hand-drawn symbol into those subparts is called symbol segmentation. Conversely, the process of drawing subparts and unifying them into a single entity is called symbol generation. Although these two procedures are interrelated, all previous endeavors have focused on a single task, either segmentation or generation. In this thesis, we propose StrokeRNN, a generative model based on a Variational Auto-Encoder (VAE) architecture, and we extend it to recognize semantically meaningful components. While existing generative systems model complete hand-drawn objects, the StrokeRNN models their constituent strokes. Hence, in contrast to prior frameworks, the StrokeRNN is capable of drawing multiple symbol categories even when trained on a single class. To segment drawings, we adopt a memory-efficient vector representation of symbols instead of the raster image format used by existing techniques. Our segmentation model is a simple yet powerful neural network that classifies stroke-level components based on the encoding generated by the StrokeRNN. Experiments show that our segmentation accuracies surpass those of existing methodologies on the available state-of-the-art dataset. Furthermore, extensive evaluations on our newly annotated dataset demonstrate that our neural network obtains significantly better scores than the best baseline model. We release our dataset to the community.