The Optimization of Fuzzy String Matching Algorithms Using (term's frequency-inverse document frequency) and (K-nearest neighbors)
Date
2022-03
Publisher
Al-Neelain University
Abstract
String matching plays a major role in day-to-day life, whether in word
processing, signal processing, data communication, or bioinformatics.
Approximate string matching is a variation of exact string matching that
demands more complex algorithms. As the name suggests, in approximate
matching, strings are matched on the basis of their non-exact similarities.
The problem is how to reduce the computation time of approximate string
matching algorithms from minutes to milliseconds and how to reduce the
number of matched fragments for the transfer step, as well as how to handle
various issues specific to Arabic. The main objective is to optimize fuzzy
string matching algorithms using TF-IDF (term frequency-inverse document
frequency) and KNN (K-nearest neighbors) in order to improve the
computation time of fuzzy string matching. Efficiency techniques are
utilized, and a combination of the TF-IDF and KNN algorithms is proposed to
achieve the optimization. A descriptive analytical method is used to
describe and analyze the algorithms, to describe fuzzy string matching as a
computation technique, and to describe the implementation of TF-IDF and
KNN. The results of this study show that TF-IDF and KNN reduce the
computation time of fuzzy string matching from minutes to milliseconds, and
they report the similarity ratio between patterns.
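The general idea of combining TF-IDF with nearest-neighbor search for fuzzy matching can be illustrated with a minimal sketch. It is not the thesis's implementation: the use of character trigrams, the smoothed IDF formula, cosine similarity as the similarity ratio, and all function names (`char_ngrams`, `fit_idf`, `tfidf_vector`, `cosine`, `nearest_neighbors`) are assumptions made for illustration only.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Split a string into overlapping character n-grams (space-padded)."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]

def fit_idf(corpus, n=3):
    """Smoothed inverse document frequency of every n-gram in the corpus."""
    doc_freq = Counter()
    for text in corpus:
        doc_freq.update(set(char_ngrams(text, n)))
    total = len(corpus)
    return {g: math.log((1 + total) / (1 + df)) + 1 for g, df in doc_freq.items()}

def tfidf_vector(text, idf, n=3):
    """L2-normalized sparse TF-IDF vector; n-grams unseen in the corpus are dropped."""
    counts = Counter(g for g in char_ngrams(text, n) if g in idf)
    vec = {g: tf * idf[g] for g, tf in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}

def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(g, 0.0) for g, w in a.items())

def nearest_neighbors(query, corpus, k=2, n=3):
    """Return the k corpus strings most similar to the query, best first."""
    idf = fit_idf(corpus, n)
    q = tfidf_vector(query, idf, n)
    scored = [(cosine(q, tfidf_vector(c, idf, n)), c) for c in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(c, round(s, 3)) for s, c in scored[:k]]

# Hypothetical example data: a misspelled query against three candidate names.
names = ["Al-Neelain University", "University of Khartoum", "Sudan University of Science"]
matches = nearest_neighbors("Al Neelain Univrsity", names, k=2)
print(matches)
```

This sketch compares the query against every candidate by brute force. In practice, the speed-up over pairwise edit-distance comparison usually comes from vectorizing the strings once and running the neighbor search over sparse matrices or an index.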
Description
A Thesis Submitted in Partial Fulfillment of the Requirements
for the M.Sc. Degree in Computer Science
Keywords
Algorithms
