archive-org.com » ORG » I » IAPR-TC11.ORG

Total: 1082

  • Changes related to "IBN SINA: A database for research on processing and understanding of Arabic manuscripts images" - TC11

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php/Special:RecentChangesLinked/IBN_SINA:_A_database_for_research_on_processing_and_understanding_of_Arabic_manuscripts_images (2016-02-15)

  • IBN SINA: A database for research on processing and understanding of Arabic manuscripts images - TC11
    classifiers can be found in reference [3]. For the skeletonization, a thinning process has been used, followed by a correction step to discover missed branch points.

    Version 1.0: The first version of the dataset comprises the feature vectors of the 20722 shapes (sub-words) extracted as connected components. The feature vector consists of 92 features, which can be divided into two parts: (1) 8 global features and (2) 84 skeleton-based features. The second part can be further divided into two sub-parts: (a) topological features, based on the relation to the branch, end, and singular points on the skeleton, and (b) geometrical features, related to the orientation and position of the sub-strokes that comprise the connected component (or shape) under study. The feature vector is regularized in terms of its length. The details can be found in section 4 of the paper [1]. All non-normalised feature vectors are provided in a single space-delimited text file, where each row corresponds to the feature vector of a single connected component.

    Version 2.0: The extended version of the dataset includes the corresponding image data of the shapes. The dataset comprises a series of MATLAB files, one for each folio of the manuscript, each containing a structure with all the available information, including the images and features of each sub-word. The metadata provided include the feature vector calculated from the skeleton of the shapes [1]. Detailed information about the structure of the files is provided in the guide. Two editions of the dataset v2.0 are provided for greater flexibility: one containing just the binarized images of the shapes (small file), and another containing both the original color images and the binarized ones (large file). The rest of the information in both files is identical.

    Related Ground Truth Data, Version 1.0: For each shape in the dataset, 15 binary labels that correspond to 15 problems are provided. For each problem, the label specifies whether or not the string of that shape contains the Arabic letter associated with that problem. For example, using Latin letters for the sake of font simplicity: if the string is "ab", the label for the problem of letter "a" is 1 (the letter is present), while the label for the problem of letter "c" marks the letter as absent. Only the 15 letters that have at least 1000 positive samples each in the dataset are considered. The 15 letters and their abbreviations are: Ein (EU), Aleph (aL), Be (bL), Dal (dL), Fa (fL), Ha (hL), Kaf (kL), Lam (lL), Mim (mL), Nun (nL), Ghain (qL), Ra (rL), Ta (tL), Waw (vL), Ya (yL). The problem names are abbreviated using Fingilish encoding, which maps each Arabic letter to an ASCII character. Please see the report for the details.

    Ground Truth Specification: The ground truth data are specified as a single 20722 x 15
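As a rough illustration of the Version 1.0 layout described above, the following Python sketch splits one space-delimited row into its 8 global and 84 skeleton-based features, and reproduces the "ab" labelling example. It is a sketch under stated assumptions, not the dataset's own tooling: the function names are hypothetical, the column order (global features first) is assumed rather than confirmed by the description, and presence is returned as a boolean instead of the dataset's numeric label encoding.

```python
from typing import Dict, Iterable, List, Tuple

N_GLOBAL = 8                         # global features (from the description)
N_SKELETON = 84                      # skeleton-based features (from the description)
N_FEATURES = N_GLOBAL + N_SKELETON   # 92 features per connected component


def parse_feature_row(line: str) -> Tuple[List[float], List[float]]:
    """Split one space-delimited row into (global, skeleton) feature lists.

    Assumption: the 8 global features come first in each row.
    """
    values = [float(token) for token in line.split()]
    if len(values) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} values, got {len(values)}")
    return values[:N_GLOBAL], values[N_GLOBAL:]


def letter_presence(sub_word: str, letters: Iterable[str]) -> Dict[str, bool]:
    """For each problem letter, report whether the sub-word string contains it."""
    return {letter: letter in sub_word for letter in letters}


# Usage: a synthetic 92-value row and the "ab" example from the text.
row = " ".join(str(i) for i in range(N_FEATURES))
global_part, skeleton_part = parse_feature_row(row)
presence = letter_presence("ab", ["a", "c"])
```

Each of the 15 problems is then an independent binary task over the same 92-dimensional vectors, which is why one feature file can serve all of them.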

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php?title=IBN_SINA:_A_database_for_research_on_processing_and_understanding_of_Arabic_manuscripts_images&printable=yes (2016-02-15)

  • IBN SINA: A database for research on processing and understanding of Arabic manuscripts images

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php/Special:Browse/IBN_SINA:_A_database_for_research_on_processing_and_understanding_of_Arabic_manuscripts_images (2016-02-15)

  • File:Dataset OR3C Thumbnail.jpg - TC11
    2010, 530 x 272, 25 KB, Dimos. File usage: the following page links to this file: Harbin Institute of Technology Opening Recognition Corpus for Chinese Characters (HIT-OR3C).

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php/File:Dataset_OR3C_Thumbnail.jpg (2016-02-15)

  • Harbin Institute of Technology Opening Recognition Corpus for Chinese Characters (HIT-OR3C) - TC11
    information is available. The characters have been collected using a handwriting pad and are recorded and labelled automatically via the handwriting document collection software, OR3C Toolkit. The software used to collect the characters is also made available (the supplied version is in Chinese).

    The dataset is organised in 5 subsets: 4 subsets of characters, Digit (1-10), Letter (11-62), GB1 (63-3817), and GB2 (3818-6825), and 1 subset of documents. The 4 subsets of characters contain 6,825 classes produced by 122 subjects, and 832,650 samples in total. A single file per subject is provided for online data and a single file per subject for offline data (see below for the file format used). The different subsets are defined as index ranges within these files.

    The document corpus corresponds to 10 news articles that contain in total 77,168 samples, drawn from 2,442 classes and produced by 20 subjects. The captured document data have been post-processed and split into individual characters; the characters were resized to 128 x 128 pixels and stored sequentially in a single image file and a single vector file, similarly to the first four subsets.

    The dataset contains 909,818 images. The total size of the dataset is 15.5 GB (1125 Mb compressed). Three file formats, defined by the dataset authors, are introduced in the related documents. The individual character images are 128 x 128 greyscale.

    Metadata: For each image a label is provided. The labels of digits and letters are encoded in ASCII; the labels of Chinese characters are encoded in GB2312-80. The label file is present in every folder and is named labels.txt.

    Related Ground Truth Data: N/A

    Related Tasks: Handwriting recognition for Chinese characters

    References: S. Zhou, Q. Chen, X. Wang, "HIT-OR3C: An Opening Recognition Corpus for Chinese Characters"
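The subset ranges, sample counts, and label encodings above can be cross-checked with a short Python sketch. This is an illustration under assumptions, not the OR3C Toolkit's API: the names `SUBSETS`, `subset_of`, and `decode_label` are hypothetical, the index ranges are taken to be 1-based and inclusive, and a label is assumed to arrive as raw bytes (one byte for ASCII digits and letters, two bytes for GB2312 characters). The actual file formats are defined in the dataset's own documents and are not reproduced here.

```python
from typing import Optional

# Class-index ranges of the four character subsets, as listed in the description.
# Assumption: indices are 1-based and both ends are inclusive.
SUBSETS = {
    "Digit": (1, 10),
    "Letter": (11, 62),
    "GB1": (63, 3817),
    "GB2": (3818, 6825),
}


def subset_of(class_index: int) -> Optional[str]:
    """Return the subset name for a class index, or None if out of range."""
    for name, (first, last) in SUBSETS.items():
        if first <= class_index <= last:
            return name
    return None


def decode_label(raw: bytes) -> str:
    """Decode one label: ASCII for a single byte, GB2312 otherwise (assumed layout)."""
    return raw.decode("ascii") if len(raw) == 1 else raw.decode("gb2312")


# Consistency check of the published counts.
N_CLASSES = 6825
N_SUBJECTS = 122
DOCUMENT_SAMPLES = 77168
character_samples = N_CLASSES * N_SUBJECTS            # 832650
total_images = character_samples + DOCUMENT_SAMPLES   # 909818
```

The arithmetic agrees with the stated totals: 6,825 classes times 122 subjects gives the 832,650 character samples, and adding the 77,168 document samples gives the 909,818 images. That is consistent with each subject writing each class once, although the description does not say so explicitly.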

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php?title=Harbin_Institute_of_Technology_Opening_Recognition_Corpus_for_Chinese_Characters_%28HIT-OR3C%29&oldid=1360 (2016-02-15)

  • View source for Harbin Institute of Technology Opening Recognition Corpus for Chinese Characters (HIT-OR3C) - TC11
    References: S. Zhou, Q. Chen, X. Wang, "HIT-OR3C: An Opening Recognition Corpus for Chinese Characters", DAS 2010 (to appear).

    Version Correspondence:
        Dataset    Task
        V1.0       V1.0
        V1.1       V1.0 (the task cell spans both dataset rows)

    Submitted Files, Version 1.1:
        Offline Characters (807 Mb): http www iapr tc11 org dataset OR3C DAS2010 v1 1 OR3C offline character rar
        Offline Documents (120 Mb): http www iapr tc11 org dataset OR3C DAS2010 v1 1 OR3C offline document rar
        Online Characters (146 Mb): http www iapr tc11 org dataset OR3C DAS2010 v1 1 OR3C online character rar

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php?title=Harbin_Institute_of_Technology_Opening_Recognition_Corpus_for_Chinese_Characters_%28HIT-OR3C%29&action=edit (2016-02-15)

  • Revision history of "Harbin Institute of Technology Opening Recognition Corpus for Chinese Characters (HIT-OR3C)" - TC11

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php?title=Harbin_Institute_of_Technology_Opening_Recognition_Corpus_for_Chinese_Characters_%28HIT-OR3C%29&action=history (2016-02-15)

  • Pages that link to "Harbin Institute of Technology Opening Recognition Corpus for Chinese Characters (HIT-OR3C)" - TC11
    The following pages link to Harbin Institute of Technology Opening Recognition Corpus for Chinese Characters (HIT-OR3C): Handwriting recognition for Chinese characters; Harbin Institute of Technology Opening Recognition Corpus for Chinese Characters (HIT-OR3C) (redirect page); Datasets per Journal Conference; Datasets List; IAPR TC11 What's New.

    Original URL path: http://www.iapr-tc11.org/mediawiki/index.php/Special:WhatLinksHere/Harbin_Institute_of_Technology_Opening_Recognition_Corpus_for_Chinese_Characters_%28HIT-OR3C%29 (2016-02-15)


