Monk - Nuggets (examples of data analysis)

Last update: Wed May 5 10:32:11 CEST 2021
  • Expected number of human labels given pattern distance from class centroid

    This graph shows the distribution of the number of human labels expected in the harvest,
    given the distance of a word or character sample from the corresponding class centroid.
    As of Feb. 24, 2017, a total of 257,576 human-labeled or human-confirmed images had been
    harvested across the collections. At a pattern distance of 0.1 or below, at least 50
    human labels can be expected. For a lifelong machine learning engine such as Monk, the
    challenge is to attract labelers to candidate samples that help the learning process
    enter a snowball effect of label collection.
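
    As a rough illustration of how such a curve could be estimated, the sketch below
    averages label counts per distance bin. It is not Monk's actual code: the function
    name, the (distance, label_count) record format, the bin width of 0.1 and the sample
    values are all assumptions made for this example.

```python
from collections import defaultdict


def expected_labels_by_distance(records, bin_width=0.1):
    """Average the label count of the records that fall into each distance bin.

    records: iterable of (distance, label_count) pairs (hypothetical format).
    Returns a dict mapping the lower edge of each non-empty bin to the mean
    label count observed in that bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, label_count in records:
        # Bin index 0 covers [0, bin_width), index 1 covers [bin_width, 2*bin_width), ...
        bin_index = int(distance // bin_width)
        sums[bin_index] += label_count
        counts[bin_index] += 1
    # Round the bin edge to hide floating-point noise such as 0.30000000000000004.
    return {round(i * bin_width, 10): sums[i] / counts[i] for i in sorted(sums)}


# Made-up sample data: (distance to class centroid, number of human labels).
records = [(0.02, 60), (0.08, 50), (0.15, 20), (0.35, 4)]
curve = expected_labels_by_distance(records)
for lower_edge, mean_labels in curve.items():
    print(f"distance >= {lower_edge:.1f}: {mean_labels:.1f} labels on average")
```

    A per-bin mean is the simplest estimator of an expected value given distance; with
    real harvest data a finer bin width or a smoothed estimate may be more appropriate.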

    For a general discussion of this topic, see:

