Sidahmed Omer Sidahmedeltoum
2022-08-24
2022-08-24
2022-03
http://hdl.handle.net/123456789/17291
A Thesis Submitted in Partial Fulfillment of the Requirements for the M.Sc. Degree in Computer Science
Abstract
String matching plays a major role in day-to-day life, whether in word processing, signal processing, data communication, or bioinformatics. Approximate string matching is a variation of exact string matching that demands more complex algorithms. As the name suggests, approximate matching compares strings on the basis of their non-exact similarities. The problem addressed is how to reduce the computation time of approximate string matching algorithms from minutes to milliseconds, how to reduce the number of matched fragments passed to the transfer step, and how to handle various issues specific to Arabic. The main objective is to optimize fuzzy string matching algorithms using TF-IDF (term frequency-inverse document frequency) and KNN (k-nearest neighbors) in order to improve the computation time of fuzzy string matching. Efficiency techniques are applied, and a combination of the TF-IDF and KNN algorithms is proposed to achieve this optimization. The study follows a descriptive analytical method: it describes and analyzes the algorithms, describes the fuzzy string matching computation technique, and describes the implementation of TF-IDF and KNN.
The results of this study show that using TF-IDF and KNN reduces the computation time of fuzzy string matching from minutes to milliseconds, and they report the similarity ratio between patterns.
Algorithms
The Optimization of Fuzzy String Matching Algorithms Using (term's frequency-inverse document frequency) and (K-nearest neighbors)
Thesis
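
The TF-IDF and KNN combination named in the abstract can be illustrated with a minimal sketch. It assumes character trigrams as the terms, smoothed IDF weights, and a brute-force cosine-similarity neighbor search. None of these choices come from the thesis, and the function names and sample strings are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


def fit_idf(corpus, n=3):
    """Smoothed inverse document frequency for every n-gram seen in the corpus."""
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(char_ngrams(doc, n)))
    total = len(corpus)
    return {g: math.log((1 + total) / (1 + df)) + 1.0 for g, df in doc_freq.items()}


def tfidf_vector(text, idf, n=3):
    """L2-normalised sparse TF-IDF vector; n-grams unseen at fit time are dropped."""
    counts = Counter(g for g in char_ngrams(text, n) if g in idf)
    vec = {g: c * idf[g] for g, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two unit-length sparse vectors (their cosine similarity)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(g, 0.0) for g, w in a.items())


def nearest_neighbors(query, corpus, k=2, n=3):
    """Return the k corpus strings most similar to the query, with their scores."""
    idf = fit_idf(corpus, n)
    vectors = [tfidf_vector(doc, idf, n) for doc in corpus]
    q = tfidf_vector(query, idf, n)
    scored = [(cosine(q, v), doc) for v, doc in zip(vectors, corpus)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc, round(score, 3)) for score, doc in scored[:k]]


# Hypothetical sample data, chosen only to illustrate a misspelled query.
names = [
    "Khartoum University",
    "University of Khartoum",
    "Omdurman Islamic University",
    "Sudan University of Science and Technology",
]
matches = nearest_neighbors("Khartum Universty", names, k=2)
```

Each entry in `matches` pairs a candidate string with a cosine score between 0 and 1, which plays the role of the similarity ratio between patterns. A real pipeline would likely build the vectors once and reuse them across queries rather than refitting on every call as this sketch does.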