Information Processing Systems, 5(103)'2012

SEARCH ADOPTED INFORMATION IN THE INTERNET USING ALGORITHMS TF, LONGSENT, WINNOWING

D.S. Glіbov, A.S. Chuprina

This paper presents a description of the main types of algorithms for finding adopted (borrowed) text and the results of a study of the TF, LongSent, and Winnowing algorithms for detecting near-duplicates on the Internet. The distinguishing feature of the study is that the algorithms are applied to each paragraph of the input document's text separately. The quality of the algorithms was assessed using three metrics: accuracy (precision), completeness (recall), and F-measure.
Keywords: duplicate, algorithm, shingle, similarity
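
As an illustration of the per-paragraph approach described in the abstract, the following is a minimal sketch of Winnowing-style fingerprinting applied to each paragraph separately, followed by the quality metrics. It is not the authors' implementation. The k-gram length `k`, the window size `w`, the MD5-based hash, the containment score, and all function names are assumptions chosen for illustration, and "accuracy" and "completeness" are assumed to correspond to precision and recall.

```python
import hashlib
import re


def kgram_hashes(text, k=5):
    """Hash every k-character substring of the normalized text (assumed k)."""
    s = re.sub(r"[^a-z0-9]", "", text.lower())  # drop case, spaces, punctuation
    return [int(hashlib.md5(s[i:i + k].encode()).hexdigest()[:8], 16)
            for i in range(len(s) - k + 1)]


def winnow(text, k=5, w=4):
    """Keep the minimum hash of each window of w consecutive k-gram hashes.

    Simplified: only hash values are kept, not their positions.
    """
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def paragraph_scores(document, source, k=5, w=4):
    """Score each paragraph of `document` by the share of its fingerprints
    that also occur in `source` (containment; an assumed similarity measure)."""
    src = winnow(source, k, w)
    scores = []
    for paragraph in re.split(r"\n\s*\n", document.strip()):
        fp = winnow(paragraph, k, w)
        scores.append(len(fp & src) / len(fp) if fp else 0.0)
    return scores


def precision_recall_f(predicted, actual):
    """Precision, recall, and F-measure for sets of flagged paragraph indices."""
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


source = ("Winnowing selects the minimum hash in every window of "
          "consecutive k-gram hashes.\n\nUnrelated filler.")
document = ("Winnowing selects the minimum hash in every window of "
            "consecutive k-gram hashes.\n\n"
            "Completely different words about gardening tomatoes and sunshine.")
scores = paragraph_scores(document, source)
```

In this sketch a paragraph would be flagged as a near-duplicate when its score exceeds a chosen threshold (the threshold is a further assumption), and the set of flagged paragraph indices can then be compared with a hand-labelled set using `precision_recall_f`.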