OCR & TTS in Matlab







CHAPTER ONE
INTRODUCTION

1.1 Project Overview :
This project demonstrates a combination of image, text, and voice technologies. The user can extract the text contained in an image, or type text directly into the editor, and have that text read aloud by a text-to-speech system. The edited text can also be saved to a file in a location chosen by the user, where it is listed among the recent documents saved from this editor.

The project explores these ideas by developing Optical Character Recognition (OCR) software and then demonstrating that software through a basic implementation of a text-to-speech conversion system. The system loads an image file, extracts the text found in the image, reads the text aloud, and stores the edited text in a file. The user can also type, or copy and paste, text directly into the editor.

1.2 Problem :
Information technology advances quickly and is closely connected to the other fields of our lives. Technology, both software and hardware, is used in many places by different age groups in the community, adults and children alike. The main problem is that one group of people has particular difficulty dealing with technology: blind people. Our project aims to help this group by converting edited text into speech that blind users can listen to.

A second aim is that many images contain text that the user sometimes needs for other purposes. In this case, our project helps the user obtain the text contained in an image by means of Optical Character Recognition (OCR).

1.3 Objectives :
A full realization of this concept would involve a few distinct steps :
• To extract text from an image with an OCR system.
• To develop software that handles text obtained from an image or written directly into the text editor.
• To read aloud the text contained in the text editor using a text-to-speech system.
• To package the above as a self-contained OCR application that operates independently of any external computing source and handles its own software inputs and outputs. Such a system would be integrated with the user's files, use the computer's speakers for output, and pass files to software already installed on the computer.

There are different significant factors to be considered while designing both Optical Character Recognition and text-to-speech systems that will produce clear text and speech outputs.

1.4 Introduction To OCR :
The goal of Optical Character Recognition (OCR) is to classify optical patterns (often contained in a digital image) corresponding to alphanumeric or other characters. The process of OCR involves several steps including segmentation, feature extraction, and classification. Each of these steps is a field unto itself, and is described briefly here in the context of a MATLAB implementation of OCR.

1.5 Text-to-Speech Software :
A Text-To-Speech (TTS) system is a computer-based system that should be able to read any text aloud, whether it was entered directly into the computer by an operator or scanned and submitted to an Optical Character Recognition system. In the context of TTS synthesis, it is impractical to record and store all the words of a language. It is therefore more appropriate to define TTS as the automatic production of speech by converting the graphemes (written symbols) of each sentence into phonemes (speech sounds).

1.6 Project Methodologies :

1.6.1 OCR Methodology :
OCR software has been around for as long as computers have needed to connect the printed world with the electronic one.
Traditional document imaging methods use templates and algorithms in a two-dimensional environment to recognize objects and patterns. OCR methods today recognize a spectrum of colors, and they can distinguish between the background and the foreground in documents. They de-skew, de-speckle, and use 3-D image correction in order to work with lower-resolution images taken from sources such as faxes, the internet, and cell phone cameras.

OCR software uses two different kinds of optical character recognition: feature extraction and matrix matching. Feature extraction recognizes shapes using statistical and mathematical techniques to detect edges, corners, and ridges in a text font in order to identify the letters in a word, sentence, and paragraph. OCR software achieves the best results when the image meets the following conditions:
• Is a clean, straight image.
• Uses a very distinguishable font such as Arial or Helvetica.
• Uses black letters on a clear background.
• Has at least 300 dpi resolution.

However, these conditions are not always possible. The best OCR techniques can still read words accurately in less ideal circumstances using matrix matching.

One example of OCR is shown below. A portion of a scanned image of text, borrowed from the web, is shown along with the corresponding (human recognized) characters from that text.

Figure 1.1 : Scanned image of text and its corresponding recognized representation.

1.6.2 Text to Speech Methodology :
A Text-To-Speech (TTS) system is a computer-based system that should be able to read any text aloud, whether it was entered directly into the computer by an operator or scanned and submitted to an Optical Character Recognition system. In the context of TTS synthesis, it is impractical to record and store all the words of a language. It is therefore more appropriate to define TTS as the automatic production of speech by converting the graphemes (written symbols) of each sentence into phonemes (speech sounds).
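The matrix matching mentioned in Section 1.6.1 can be illustrated with a small sketch. It compares one glyph against a set of stored templates using the 2-D correlation coefficient and picks the best-scoring template. Everything in it is an assumption made for illustration: the three-character alphabet, the 5 x 3 toy patterns, and the helper name corr2d (a base-MATLAB stand-in for the Image Processing Toolbox function corr2). It is not the project's template set.

```matlab
% Matrix matching sketch with toy data (hypothetical 5x3 glyphs).
% corr2d computes the 2-D correlation coefficient of two equal-size matrices.
corr2d = @(a, b) sum(sum((a - mean(a(:))) .* (b - mean(b(:))))) / ...
    sqrt(sum(sum((a - mean(a(:))).^2)) * sum(sum((b - mean(b(:))).^2)));

labels = 'ILT';                                  % hypothetical alphabet
templates = { ...
    [0 1 0; 0 1 0; 0 1 0; 0 1 0; 0 1 0], ...     % 'I'
    [1 0 0; 1 0 0; 1 0 0; 1 0 0; 1 1 1], ...     % 'L'
    [1 1 1; 0 1 0; 0 1 0; 0 1 0; 0 1 0]};        % 'T'

% A degraded 'L': the last pixel of the bottom stroke is missing.
glyph = [1 0 0; 1 0 0; 1 0 0; 1 0 0; 1 1 0];

% Score the glyph against every template.
scores = zeros(1, numel(templates));
for k = 1:numel(templates)
    scores(k) = corr2d(double(glyph), double(templates{k}));
end

% The template with the highest correlation gives the recognized character.
[bestScore, bestIdx] = max(scores);
recognized = labels(bestIdx);
fprintf('Recognized %c (score %.2f)\n', recognized, bestScore);
```

Even with one pixel missing, the 'L' template correlates most strongly with the glyph, while the 'I' and 'T' templates score below zero. This tolerance to small defects is why matrix matching can still work on less-than-ideal scans, provided the glyph has first been scaled to the template size.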
Figure 1.2 : TTS System.

1.7 Speech Synthesis :
Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood. The most significant qualities of a speech synthesis system are naturalness and intelligibility. Naturalness expresses how closely the output sounds like human speech, whereas intelligibility is the ease with which the output is understood. The ideal speech synthesizer is both natural and intelligible, so speech synthesis systems usually try to maximize both characteristics. There are different significant factors to be considered while designing a text-to-speech system that will produce clear speech.

1.7.1 Text To Speech System :
A TTS synthesizer is a computer-based system that should be able to read any text clearly, whether it was entered into the computer by an operator or scanned and submitted to an Optical Character Recognition (OCR) system. The purpose of a text-to-speech system is to convert an arbitrary given text into a spoken waveform. The main components of a text-to-speech system are text processing and speech production.

Figure 1.3 : Flowchart of Text to Speech Recognition.

1.7.2 Speech Generation Component :
Given a sequence of phonemes, the purpose of the speech generation component is to synthesize the acoustic waveform. Speech generation has been attempted by concatenating recorded words. Recent state-of-the-art speech synthesis produces natural-sounding speech by using a huge number of speech pieces. Storing a huge number of pieces and retrieving them in real time is feasible thanks to the availability of cheap memory and computation power. The problems of a unit selection speech synthesis system fall into three areas: the choice of unit size, the generation of the speech database, and the criteria for selecting a unit.

1.7.3 Speech Synthesis Process :
This TTS system is able to read any written text. The two primary methods for producing synthetic speech waveforms are concatenative synthesis and formant synthesis. Concatenative synthesis is based on the concatenation of pieces of recorded words, and it usually produces the most natural-sounding synthesized speech. We used concatenative synthesis for our TTS. In this system, we have developed a phonetic-based text-to-speech synthesis system. The input text first goes through preprocessing and tokenization; this procedure is called text normalization. We can improve the speech quality using the MATLAB language. The following figure shows the block diagram for the TTS system.

Figure 1.4 : Block Diagram for Text to speech Synthesis.
Figure 1.5 : Flow chart for TTS with example.

1.8 Speech Synthesis Technology :
Research in the area of speech synthesis has been going on for decades. As we found out in our research, numerous models and theories exist for the best way of implementing a speech synthesis system. Although the models seemed intuitive from a high-level perspective, they quickly grew in complexity as we got closer to implementation.

1.9 MATLAB Overview :
MATLAB is widely used in all areas of applied mathematics, in education and research at universities, and in industry. MATLAB stands for MATrix LABoratory, and the software is built up around vectors and matrices. This makes the software particularly useful for linear algebra, but MATLAB is also a great tool for solving algebraic and differential equations and for numerical integration. MATLAB has powerful graphic tools and can produce nice pictures in both 2D and 3D. It is also a programming language, and is one of the easiest programming languages for writing mathematical programs. MATLAB also has some toolboxes useful for signal processing, image processing, optimization, etc.

MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include:
• Math and computation
• Algorithm development
• Modeling, simulation, and prototyping
• Data analysis, exploration, and visualization
• Scientific and engineering graphics
• Application development, including Graphical User Interface building

MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar noninteractive language such as C or Fortran. MATLAB was originally written to provide easy access to matrix software developed by the LINPACK and EISPACK projects, which together represent the state of the art in software for matrix computation.

MATLAB has evolved over a period of years with input from many users. In university environments, it is the standard instructional tool for introductory and advanced courses in mathematics, engineering, and science. In industry, MATLAB is the tool of choice for high-productivity research, development, and analysis.

MATLAB features a family of application-specific solutions called toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn and apply specialized technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are available include signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and many others.

1.10 History of Matlab :
Cleve Moler, the chairman of the computer science department at the University of New Mexico, started developing MATLAB in the late 1970s. He designed it to give his students access to LINPACK and EISPACK without them having to learn Fortran. It soon spread to other universities and found a strong audience within the applied mathematics community. Jack Little, an engineer, was exposed to it during a visit Moler made to Stanford University in 1983. Recognizing its commercial potential, he joined with Moler and Steve Bangert. They rewrote MATLAB in C and founded MathWorks in 1984 to continue its development. These rewritten libraries were known as JACKPAC. In 2000, MATLAB was rewritten to use a newer set of libraries for matrix manipulation, LAPACK.

MATLAB was first adopted by researchers and practitioners in control engineering, Little's specialty, but quickly spread to many other domains. It is now also used in education, in particular the teaching of linear algebra and numerical analysis, and is popular amongst scientists involved in image processing.

1.11 SQL Server Overview :
Generically, a SQL server is any database management system (DBMS) that can respond to queries from client machines formatted in the SQL language. When capitalized, the term generally refers to either of two database management products from Sybase and Microsoft. Both companies offer client-server DBMS products called SQL Server.

1.12 The History of SQL Server :
IBM invented a computer language back in the 1970s designed specifically for database queries called SEQUEL, which stood for Structured English Query Language. Over time the language has been added to, so that it is not just a language for queries but can also be used to build databases and manage security of the database engine. IBM released SEQUEL into the public domain, where it became known as SQL. Because of this heritage you can pronounce it as "sequel" or spell it out as "S-Q-L" when talking about it. Various versions of SQL are used in today's database engines. Microsoft SQL Server uses a version called Transact-SQL. Sams Publishing has a book titled Teach Yourself Transact-SQL in 21 Days, which has more details on the language and its usage.

Microsoft initially developed SQL Server (a database product that understands the SQL language) with Sybase Corporation for use on the IBM OS/2 platform. When Microsoft and IBM split, Microsoft abandoned OS/2 in favor of its new network operating system, Windows NT Advanced Server. At that point, Microsoft decided to further develop the SQL Server engine for Windows NT by itself. The resulting product was Microsoft SQL Server 4.2, which was updated to 4.21. After Microsoft and Sybase parted ways, Sybase further developed its database engine to run on Windows NT (Sybase System 10 and now System 11), and Microsoft developed SQL Server 6.0 and then SQL Server 6.5, which also ran on top of Windows NT.

SQL Server 7.0 now runs on Windows NT as well as on Windows 95 and Windows 98. Although you can run SQL Server 7.0 on a Windows 9x system, you do not get all the functionality of SQL Server. When running it on the Windows 9x platform, you lose the capability to use multiple processors, Windows NT security, NTFS (New Technology File System) volumes, and much more. We strongly urge you to use SQL Server 7.0 on Windows NT rather than on Windows 9x.

Windows NT has other advantages as well. The NT platform is designed to support multiple users. Windows 9x is not designed this way, and your SQL Server performance degrades rapidly as you add more users. SQL Server 7.0 is implemented as a service on either NT Workstation or NT Server (which makes it run on the server side of Windows NT) and as an application on Windows 95/98. The included utilities, such as the SQL Server Enterprise Manager, operate from the client side of Windows NT Server or NT Workstation. Of course, just like all other applications on Windows 9x, the tools run as applications.

A service is an application NT can start when booting up that adds functionality to the server side of NT. Services also have a generic application programming interface (API) that can be controlled programmatically. Threads originating from a service are automatically given a higher priority than threads originating from an application.

1.13 SQL Server 2008 R2 :
Microsoft SQL Server 2008 R2 is the most advanced, trusted, and scalable data platform released to date. Building on the success of the original SQL Server 2008 release, SQL Server 2008 R2 has made an impact on organizations worldwide with its groundbreaking capabilities, empowering end users through self-service business intelligence (BI), bolstering efficiency and collaboration between database administrators (DBAs) and application developers, and scaling to accommodate the most demanding data workloads. Windows Server 2008 R2 is recommended as the underlying operating system for deploying SQL Server 2008 R2.

CHAPTER TWO
PROJECT ANALYSIS

2.1 The Classification Process :
There are two steps in building a classifier, training and testing. These steps can be broken down further into sub-steps :
1. Training :
   a. Pre-processing: processes the data so it is in a suitable form for use.
   b. Feature extraction: reduces the amount of data by extracting relevant information. Usually results in a vector of scalar values.
   c. Model estimation: from the finite set of feature vectors, a model (usually statistical) needs to be estimated for each class of the training data.
2. Testing :
   a. Pre-processing.
   b. Feature extraction (both same as above).
   c. Classification: compares feature vectors to the various models and finds the closest match. One can use a distance measure.

Figure 2.1 : The pattern classification process.

2.2 OCR - Pre-processing :
These are the pre-processing steps often performed in OCR :
• Binarization: the input is usually a grayscale image, so binarization is simply a matter of choosing a threshold value.
• Morphological operators: remove isolated specks and holes in characters. The majority operator can be used.
• Segmentation: check connectivity of shapes, label, and isolate. MATLAB 6.1's bwlabel and regionprops functions can be used. Characters that are not connected cause difficulties, e.g. the letter i, a semicolon, or a colon (; or :).

Segmentation is by far the most important aspect of the pre-processing stage. It allows the recognizer to extract features from each individual character. In the more complicated case of handwritten text, the segmentation problem becomes much more difficult as letters tend to be connected to each other.

2.3 OCR - Feature extraction :
Given a segmented (isolated) character, the useful features for recognition are :
1. Moment based features : Think of each character as a two-dimensional distribution of pixel mass f(x, y). The 2-D moments of the character are m_pq = sum over x and y of x^p y^q f(x, y). From the moments we can compute features like:
   1. Total mass (number of pixels in a binarized character)
   2. Centroid (center of mass)
   3. Elliptical parameters
      i. Eccentricity (ratio of major to minor axis)
      ii. Orientation (angle of major axis)
   4. Skewness
   5. Kurtosis
   6. Higher order moments
2. Hough and chain code transform
3. Fourier transform and series

2.4 OCR - Model Estimation :
As an illustration, suppose we compute two features for each realization of the characters 0 through 9. Plotting each character class as a function of the two features, we have:

Figure 2.2 : Character classes plotted as a function of two features.

In general, given labeled sets of features for many characters, where the labels correspond to the particular classes that the characters belong to,
we wish to estimate a statistical model for each character class. 17 .3 : Flowchart of recognizing words The Optical Character Recognition deals with recognition of optically processed characters.Figure 2. Reliably interpreting text from real-world photos is a challenging problem due to variations in environmental factors even it becomes easier using the best open source OCR engine. CHAPTER THREE PROJECT DESIGN  The project Design with the GUI (Graphical User Interface) : Figure 3.  Load Image : 18 .1 : The main GUI of the project. 2 : Loading an image from computer into the application. 19 .Figure 3.'Enable'. address = cat(2. imagen=imread(address).tif'}.imageInfo.img_display). set(handles.jpg'. axes(handles.[pathname]).'*.'String'. imagesc(img).path.'String'.gif'.[filename]).'on').'Please wait.filename).'*.'Enable'.'on'). set(handles. set(handles.path.  The matlab code : [filename.'on').btnConvert. set(handles. h = waitbar(0. % Show image imshow(imagen).14).'Visible'. 'Pick an Image File'). if (filename==0) warndlg('You did not selected any file ') .filename]). % fille is not selected end img=imread([pathname. set(handles. set(handles..pathname.img_display.bmp'.'on').'FontSize'. for step = 1:steps % computations take place here waitbar(step / steps) end close(h) set(handles.'). pathname] = uigetfile({'*.'Enable'. steps = 100.'*.text1..text1. D=imread('letters_num bers\D.bmp').  Create Templates : %CREATE TEMPLATES %Letter clc. C=imread('letters_numbers\C. close all.bmp'). 20 . K=imread('letters_numbers\K. E=imread('letters_numbers\E.bmp').bmp'). Recognize Text : "In Folder " letters_numbers Figure 3.F=imread('letters_num bers\F. G=imread('letters_numbers\G.L=imread('letters_num bers\L.H=imread('letters_num bers\H.bmp').bmp').bmp').bmp').bmp').J=imread('letters_num bers\J.bmp').B=imread('letters_num bers\B. I=imread('letters_numbers\I. A=imread('letters_numbers\A.bmp').bmp').3 : Recognize text pattern. png').png'). Q=imread('letters_numbers\Q.bmp'). 
S=imread('letters_numbers\S. %lower case letters a=imread('letters_numbers\a.png'). Y=imread('letters_numbers\Y.png'). q=imread('letters_numbers\q.p=imread('letters_num bers\p.bmp').b=imread('letters_num bers\b.V=imread('letters_num bers\V. s=imread('letters_numbers\s.png').bmp').x=imread('letters_num bers\x.png').bmp').png').png').bmp').f=imread('letters_num bers\f. m=imread('letters_numbers\m.T=imread('letters_num bers\T.png').png').png'). U=imread('letters_numbers\U.M=imread('letters_numbers\M. g=imread('letters_numbers\g.bmp').bmp'). u=imread('letters_numbers\u.X=imread('letters_num bers\X.png').v=imread('letters_num bers\v.h=imread('letters_num bers\h.bmp'). o=imread('letters_numbers\o.bmp').png').d=imread('letters_num bers\d.l=imread('letters_num bers\l.bmp').n=imread('letters_num bers\n.bmp').Z=imread('letters_num bers\Z.png').bmp').R=imread('letters_num bers\R.png'). i=imread('letters_numbers\i. w=imread('letters_numbers\w.r=imread('letters_num bers\r.png'). O=imread('letters_numbers\O.png').j=imread('letters_num bers\j.bmp').png').bmp').png').png'). 21 .png'). W=imread('letters_numbers\W. c=imread('letters_numbers\c. k=imread('letters_numbers\k.png').P=imread('letters_num bers\P.png').t=imread('letters_num bers\t.png').N=imread('letters_num bers\N. e=imread('letters_numbers\e. ... nine=imread('letters_numbers\9.bmp'). character=[letter number lowercase]. 24 24 24 24 24 24 24 ...png'). 24 24 24 24 24 24 24 24 .z=imread('letters_num bers\z.bmp'). save ('templates'...bmp'). six seven eight nine zero]... %*-*-*-*-*-*-*-*-*-*-*letter=[A B C D E F G H I J K L M. N O P Q R S T U V W X Y Z]. seven=imread('letters_numbers\7.y=imread('letters_numbers\y.bmp')... three=imread('letters_numbers\3..bmp'). %Number one=imread('letters_numbers\1.bmp').'templates') clear all 22 ..bmp').eight=imread('let ters_numbers\8..42. 24 24 24 24 24 24 24 ...four=imread('lett ers_numbers\4. 24 24]). zero=imread('letters_numbers\0. l m n o p q r s t u v w x y z]. 
templates=mat2cell(character..bmp'). five=imread('letters_numbers\5. lowercase = [a b c d e f g h i j k . two=imread('letters_numbers\2. 24 24 24 24 24 24 24 24 ..bmp')... 24 24 24 24 24 24 24 24 . 24 24 24 24 24 24 24 .png')... 24 24 24 24 24 24 24 24 . number=[one two three four five. six=imread('letters_numbers\6.[24 24 24 24 24 24 24 .bmp'). n}.imagn). elseif vd==6 letter='F'. comp=[comp sem]. elseif vd==5 letter='E'. end %pause(1) vd=find(comp==max(comp)). % letter=read_letter(imagn) %load templates global templates comp=[ ]. elseif vd==3 letter='C'. for n=1:num_letras sem=corr2(templates{1. Read Letter : %function read_letter function letter=read_letter(imagn. elseif vd==2 letter='B'.bmp'). % Size of 'imagn' must be 42 x 24 pixels % Example: % imagn=imread('D. elseif vd==4 letter='D'. elseif vd==7 23 .num_letras) % Computes the correlation between template and input image % and its output is a string containing the letter. %*-*-*-*-*-*-*-*-*-*-*-*-*if vd==1 letter='A'. letter='G'. elseif vd==20 letter='T'. elseif vd==11 letter='K'. elseif vd==19 letter='S'. elseif vd==16 letter='P'. elseif vd==24 letter='X'. elseif vd==12 letter='L'. elseif vd==13 letter='M'. elseif vd==8 letter='H'. %*-*-*-*-* 24 . elseif vd==14 letter='N'. elseif vd==9 letter='I'. elseif vd==21 letter='U'. elseif vd==26 letter='Z'. elseif vd==10 letter='J'. elseif vd==23 letter='W'. elseif vd==17 letter='Q'. elseif vd==18 letter='R'. elseif vd==25 letter='Y'. elseif vd==15 letter='O'. elseif vd==22 letter='V'. elseif vd==36 letter='0'. elseif vd==32 letter='6'. elseif vd==33 letter='7'. elseif vd==35 letter='9'. elseif vd==43 letter='g'. elseif vd==46 25 . elseif vd==38 letter='b'. elseif vd==41 letter='e'. elseif vd==34 letter='8'. elseif vd==40 letter='d'. elseif vd==39 letter='c'. elseif vd==30 letter='4'. elseif vd==31 letter='5'. elseif vd==29 letter='3'. elseif vd==44 letter='h'. elseif vd==42 letter='f'. elseif vd==28 letter='2'.elseif vd==27 letter='1'. elseif vd==45 letter='i'. 
%******** elseif vd==37 letter='a'. elseif vd==53 letter='q'. elseif vd==55 letter='s'. else letter='l'. elseif vd==47 letter='k'. elseif vd==57 letter='u'. elseif vd==58 letter='v'. elseif vd==48 letter='l'. elseif vd==59 letter='w'. elseif vd==62 letter='z'. elseif vd==50 letter='n'. elseif vd==61 letter='y'. elseif vd==49 letter='m'.letter='j'. elseif vd==52 letter='p'. %*-*-*-*-* End 26 . elseif vd==51 letter='o'. elseif vd==54 letter='r'. elseif vd==56 letter='t'. elseif vd==60 letter='x'. 2).2). %title('line sent in the function letter').imshow(nm).1.imshow(rm).imshow(fl).%Only one line.imshow(re). end end function img_out=clip(img_in) 27 .s:end).1). % First letter matrix %figure. space = size(rm. %*-*-*Uncomment lines below to see the result*%subplot(2. %figure.2)-size(re.s)). if sum_col==0 k = 'true'.% Remaining line matrix %figure. num_filas=size(im_texto. rm=im_texto(:. Lettere crope : %function letter_in_a_line function [fl re space]=letter_crop(im_texto) % Divide letters in lines im_texto=clip(im_texto).2). fl = clip(nm). re=[ ]. %title('first letter in the function letter_in_a_line').1:s-1).1. %pause(1). re=clip(rm). sum_col = sum(im_texto(:. %pause(1). for s=1:num_filas s. %title('remaining letters in the function letter_in_a_line'). break else fl=im_texto. %subplot(2. %pause(1). space = 0.imshow(im_texto). nm=im_texto(:. min(c):max(c)). % First line matrix rm=im_texto(s:end. for s=1:num_filas if sum(im_texto(s. % subplot(2.title('REMAIN LINES') im_texto=clip(im_texto). % [fl re]=lines(im_texto).imshow(re). re=clip(rm).imshow(im_texto).title('FIRST LINE') % subplot(3.1.imshow(fl). fl->first line.  Lines Crop : function [fl re]=lines(im_texto) % Divide text in lines % im_texto->input image.1).% Remain line matrix fl = clip(nm).:))==0 nm=im_texto(1:s-1.2).1. num_filas=size(im_texto. break else fl=im_texto.1. :).1.%Crops image 28 .3). % subplot(3.2).title('INPUT IMAGE') % subplot(3. re=[ ]. 
end end function img_out=clip(img_in) [f c]=find(img_in).min(c):max(c)). img_out=img_in(min(f):max(f).%Only one line.1). %*-*-*Uncomment lines below to see the result**-*-*% subplot(2.imshow(re).imshow(fl).1.jpg'). :). re->remain line % Example: % im_texto=imread('TEST_3.[f c]=find(img_in).1). img_out=img_in(min(f):max(f). % Remove all object containing fewer than 30 pixels imagen = bwareaopen(imagen. address = cat(2.threshold). % --.4 : Recognize text in the project.path. imagen =~im2bw(imagen.text1. filename=get(handles.3)==3 %RGB image imagen=rgb2gray(imagen).filename). end % Convert to BW threshold = graythresh(imagen). handles) % hObject handle to btnConvert (see GCBO) % eventdata reserved .'String').pathname. eventdata. imagen=imread(address).Executes on button press in btnConvert.to be defined in a future version of MATLAB % handles structure with handles and user data (see GUIDATA) % Convert to gray scale pathname=get(handles. function btnConvert_Callback(hObject. if size(imagen.Figure 3.'String').30). %Storage matrix word from image 29 . 5) img_r = imresize(fc. spaces betweeen % to compute the total % adjacent letter rc = fl.2). %Uncomment line below to see lines one by one %figure.%Storage matrix word from image word=[ ]. imgn=fl. while 1 %Fcn 'letter_crop' separate letters in a line [fc rc space]=letter_crop(rc). %fc = first letter in the line %rc = remaining cropped line %space = space between the letter % cropped and the next letter %uncomment below line to see letters one by one %figure. while 1 %Fcn 'lines' separate lines in text [fl re]=lines(re). n=0. text=[ ].imshow(fc).imshow(fl). %resize letter so that correlation %can be performed 30 . re=imagen. % Load templates load templates global templates % Compute the number of letters in template file num_letras=size(templates.pause(2) %-------------------------------------------------spacevector = [].pause(0. text=''.[42 24]). 
end end %breaks loop when there are no %-------------------------------------------------% max_space = max(spacevector). no_spaces = 0.%Write 'word' in text file (lower) %fprintf(fid.75 * max_space) no_spaces = no_spaces + 1.'%s\n'. for m = x:n word(n+x-m+no_spaces)=word(n+xm+no_spaces-1). spacevector(n)=space.'%s\n'. %Fcn 'read_letter' correlates the cropped letter with the images %given in the folder 'letters_numbers' letter = read_letter(img_r. if isempty(rc) more characters break. for x= 1:n %loop to introduce space at requisite locations if spacevector(x+no_spaces)> (0.lower(word)).num_letras). spacevector = [0 spacevector].%Write 'word' in text file (upper) text = char(text.word).n = n + 1. word). %letter concatenation word = [word letter]. 31 . end word(x+no_spaces) = ' '. end end %fprintf(fid. 24).'Please wait.5 : Save to Notepad file format. guidata(hObject.. set(handles.  Save to NotePad : Figure 3. for step = 1:steps % computations take place here waitbar(step / steps) end close(h) set(handles. steps = 100. set(handles. %*When the sentences finish.'). handles). breaks the loop if isempty(re) %See variable 're' in Fcn 'lines' break end end h = waitbar(0.'Enable'.'on').text2.'String'.Speak.text). 32 ..% Clear 'word' variable word=[ ].text2.'FontSize'. 'txt'.'file'). switch button 33 . Figure 3.Executes on button press in btnOpen. filename=strcat(fname.txt as file for write fname=get(handles. % --.'Cancle'). function btnOk_Callback(hObject.. handles) % hObject handle to btnOk (see GCBO) % eventdata reserved .text2. eventdata.edit_location.filename).% --. pathname=get(handles.Executes on button press in btnOk.2) button = questdlg('file name already exist '. function btnOpen_Callback(hObject.'Cancle'.6 : Saving a text file. handles) % hObject handle to btnOpen (see GCBO) % eventdata reserved .to be defined in a future version of MATLAB % handles structure with handles and user data (see GUIDATA) value=get(handles.'.edit_name.'String'). 
% --- Executes on button press in btnOk.
function btnOk_Callback(hObject, eventdata, handles)
% hObject    handle to btnOk (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
%Opens text.txt as file for write
fname = get(handles.edit_name,'String');
filename = strcat(fname,'.txt');
pathname = get(handles.edit_location,'String');
filepath = fullfile(pathname,filename);
if isequal(exist(filepath,'file'),2)
    button = questdlg('file name already exist ','Warning','Override','Cancle','Cancle');
    switch button
        case 'Override'
            fid = fopen(filepath,'wt');
        case 'Cancle'
            return;
    end
else
    fid = fopen(filepath,'wt');
end
h = waitbar(0,'Please wait...');
steps = 100;
for step = 1:steps
    % computations take place here
    waitbar(step / steps)
end
close(h)
%fprintf(fid,'%s\n',lower(word));%Write 'word' in text file (lower)
txt = getappdata(0,'txt');
nRows = size(txt,1);
stxt = '';
if nRows>1
    for k=1:nRows
        fprintf(fid,'%s\n',txt(k,:));%Write 'word' in text file (upper)
        stxt = strcat(stxt,txt(k,:),32,10);
    end
else
    fprintf(fid,'%s\n',txt);
    stxt = txt;
end
fclose(fid);
rmappdata(0,'txt');
date1 = date;
decr = get(handles.edit_note,'String');
if strcmp(decr,'Write Note here ...')
    decr = NaN;
end
conn = database('dbFiles','sa','123');
columns = {'id','name_file','text','path_file','time','note'};
%data1 = cell(1,6);
data1 = {handles.lastid fname stxt pathname date1 decr};
insert(conn,'File_Data',columns,data1);
close(conn)
% Update handles structure
guidata(hObject, handles);
%Open 'text.txt' file
winopen(filepath)
close

Figure 3.7 : Edited text in a Notepad file format.

• Load Text File :

Figure 3.8 : Loading a text file (Notepad file format).

% --- Executes on button press in load.
function load_Callback(hObject, eventdata, handles)
% hObject    handle to load (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
[filename,pathname] = uigetfile('*.txt','select txt file');
filepath = fullfile(pathname,filename);
fid = fopen(filepath,'r');
%# preassign s to some large cell array
txt = cell(10000,1);
sizS = 10000;
lineCt = 1;
tline = fgetl(fid);
while ischar(tline)
    txt{lineCt} = tline;
    lineCt = lineCt + 1;
    %# grow s if necessary
    if lineCt > sizS
        txt = [txt;cell(10000,1)];
        sizS = sizS + 10000;
    end
    tline = fgetl(fid);
end
%# remove empty entries in s
txt(lineCt:end) = [];
h = waitbar(0,'Please wait...');
steps = 100;
for step = 1:steps
    % computations take place here
    waitbar(step / steps)
end
close(h);
set(handles.text2,'String',txt)
set(handles.Speak,'Enable','on')
fclose(fid)

• Loading file in edit tool :

Figure 3.9 : Loading a text of notepad file format in the edit tool.

• Text To Speech :

% --- Executes on button press in Speak.
function Speak_Callback(hObject, eventdata, handles)
% hObject    handle to Speak (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
text = get(handles.text2,'String');
if isempty(text)
    text = 'Write something to speak';
end
nRows = size(text,1);
try
    % Uses the .NET speech synthesizer, so this runs on Windows only
    NET.addAssembly('System.Speech');
    Speaker = System.Speech.Synthesis.SpeechSynthesizer;
    for n=1:nRows
        rwtxt = text(n,:);
        if ~isa(rwtxt,'cell')
            rwtxt = {rwtxt};
        end
        for k=1:length(rwtxt)
            Speaker.Speak (rwtxt{k});
        end
    end
catch
    warning(['Not working !!']);
end

• Design DataBase (using SQL Server 2008 R) :

Table Name : File_Data :

Figure 3.10 : File data.

Some Data in a Table :

Figure 3.11 : Some data in a database table.

Microsoft SQL Server ODBC in Matlab for Windows :

Figure 3.12 : Database explorer in matlab.

• List of Text in Database :

Figure 3.13 : List of text in the database.
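The loading callback earlier in this section reads the text file line by line with fgetl into a preallocated cell array, grows the array when it fills up, and trims the unused cells at the end. The sketch below reproduces that pattern on a small temporary file so it can be run without the GUI. It is an illustration under assumptions, not the project code: the file content, the chunk size of 2 (chosen so that the growth branch actually runs) and the names sample_path, lines_read, capacity and count are all hypothetical.

```matlab
% Write a small sample file to a temporary location (illustrative content).
sample_path = [tempname() '.txt'];
fid = fopen(sample_path, 'wt');
fprintf(fid, '%s\n', 'first line', 'second line', 'third line');
fclose(fid);

% Read it back line by line into a preallocated cell array.
chunk      = 2;                  % deliberately small so the array must grow
lines_read = cell(chunk, 1);
capacity   = chunk;
count      = 0;

fid   = fopen(sample_path, 'r');
tline = fgetl(fid);
while ischar(tline)              % fgetl returns -1 (not char) at end of file
    count = count + 1;
    if count > capacity          % grow the array when it is full
        lines_read = [lines_read; cell(chunk, 1)];
        capacity   = capacity + chunk;
    end
    lines_read{count} = tline;
    tline = fgetl(fid);
end
fclose(fid);

lines_read(count+1:end) = [];    % drop the unused cells
delete(sample_path);             % clean up the temporary file
```

In a GUI callback it is usually worth checking the value returned by uigetfile before opening the file, since it is 0 when the user cancels the dialog.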
• On Opening Form :

% --- Executes just before list_files is made visible.
function list_files_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to list_files (see VARARGIN)
handles.edit = 0;
conn = database('dbFiles','sa','123');
curs = exec(conn,['select * from File_Data']);
setdbprefs('DataReturnFormat','cellarray')
curs = fetch(curs);
a = curs.Data;
if ~isequal('No Data',a)
    set(handles.listbox1,'String',a(:,2))
    set(handles.listbox1,'Value',1)
    set(handles.edit_id,'String',a(1,1))
    set(handles.edit_name,'String',a(1,2))
    set(handles.edit_location,'String',a(1,4))
    set(handles.edit_date,'String',a(1,5))
    set(handles.edit_text,'String',a(1,3))
    if isempty(a(1,6))
        set(handles.edit_note,'String','There is no note');
    else
        set(handles.edit_note,'String',a(1,6));
    end
end
% Choose default command line output for list_files
handles.output = hObject;
% Update handles structure
guidata(hObject, handles);

• Open File :

Figure 3.14 : Open file by using notepad file format.

% --- Executes on button press in btn_open.
function btn_open_Callback(hObject, eventdata, handles)
% hObject    handle to btn_open (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
id = get(handles.edit_id,'String');
txt = get(handles.edit_text,'String');
if ~isempty(id)
    fname = get(handles.edit_name,'String');
    fname = strcat(fname,'.txt');
    pathname = get(handles.edit_location,'String');
    filepath = fullfile(pathname,fname);
    ee = exist(filepath{1},'file');
    if isequal(ee,2)
        winopen(filepath{1})
    else
        button = questdlg(['filse has been damged or change it location.',char(10),'What you want to do?'], ...
            'Warning','Create','Delete','Cancle','Cancle');
        switch button
            case 'Create'
                fid = fopen(filepath{1}, 'wt');
                nRows = size(txt, 1);
                for k=1:nRows
                    fprintf(fid,'%s\n',txt{k,:});%Write 'word' in text file (upper)
                end
                fclose(fid);
                winopen(filepath{1});
            case 'Delete'
                button = questdlg(['Are you sure you want to delete?'],'Warning','OK','Cancle','Cancle');
                switch button
                    case 'OK'
                        btn_del_Callback(hObject, eventdata, handles);
                    case 'Cancle'
                        return;
                end
            case 'Cancle'
                return;
        end
    end
end
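The callbacks above index the query result a with both parentheses and braces, and the two behave differently on cell arrays. The sketch below uses a mocked result instead of a database connection to show the difference. Everything in it is an assumption made for illustration: the two sample rows, the variable names, and in particular the idea that an empty result comes back as the one-cell array {'No Data'} when the return format is 'cellarray', which should be checked against the Database Toolbox version in use.

```matlab
% Mocked query results (no database connection is used here).
rows  = {1, 'note1', 'HELLO', 'C:\docs', '01-Jan-2013', ''; ...
         2, 'note2', 'WORLD', 'C:\docs', '02-Jan-2013', 'checked'};
empty_result = {'No Data'};     % assumed shape of an empty result

% Parentheses return a (smaller) cell array, braces return the content.
names    = rows(:, 2);          % 2x1 cell, suitable for a list box 'String'
first_id = rows{1, 1};          % numeric value 1

% isempty on a 1x1 cell is false even when the stored text is empty,
% so the content has to be inspected with braces.
cell_is_empty    = isempty(rows(1, 6));     % false
content_is_empty = isempty(rows{1, 6});     % true

% Comparing a char vector with the whole cell array never matches;
% comparing it with the first cell does.
whole_cell_matches = isequal('No Data', empty_result);      % false
first_cell_matches = isequal('No Data', empty_result{1});   % true
```

If the assumption about the empty result holds, a test written as isequal('No Data',a) would not detect an empty table, and isempty(a(1,6)) would not detect a missing note, so both conditions may be worth reviewing in the form code.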
• Edit :

Figure 3.15 : Edited text in notepad file.

% --- Executes on button press in pushbutton5.
function btn_edit_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton5 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
id = get(handles.edit_id,'String');
if ~isempty(id)
    handles.edit = handles.edit+1;
    if handles.edit==1
        % first click: make the note editable
        set(handles.edit_note,'Enable','on')
        set(handles.edit_note,'BackgroundColor',[1.0 1.0 1.0]);
    else
        % second click: save the note and lock the field again
        handles.edit = 0;
        edit_txt = get(handles.edit_note,'String');
        if ~isequal(edit_txt,'There is no note')
            whereclause = strcat('where id=',id);
            conn = database('dbFiles','sa','123');
            update(conn,'File_Data',{'note'},{edit_txt},whereclause)
            set(handles.edit_note,'Enable','inactive')
            set(handles.edit_note,'BackgroundColor',[0.961 0.976 0.992]);
            helpdlg('You are Done update','Update')
        else
            set(handles.edit_note,'Enable','inactive')
            set(handles.edit_note,'BackgroundColor',[0.961 0.976 0.992]);
        end
    end
end
% Update handles structure
guidata(hObject, handles);

• Delete From Database :

% --- Executes on button press in btn_del.
function btn_del_Callback(hObject, eventdata, handles)
% hObject    handle to btn_del (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
id = get(handles.edit_id,'String');
if ~isempty(id)
    button = questdlg(['Are you sure you want to delete?'],'Warning','OK','Cancle','Cancle');
    switch button
        case 'OK'
            id = get(handles.edit_id,'String');
            conn = database('dbFiles','sa','123');
            query = strcat('delete from File_Data where id=',id);
            curs = exec(conn,query{1});
            % reload the list after the delete
            curs = exec(conn,['select * from File_Data']);
            setdbprefs('DataReturnFormat','cellarray')
            curs = fetch(curs);
            a = curs.Data;
            if ~isequal('No Data',a{1})
                set(handles.listbox1,'Value',1)
                set(handles.listbox1,'String',a(:,2))
                set(handles.edit_id,'String',a(1,1))
                set(handles.edit_name,'String',a(1,2))
                set(handles.edit_location,'String',a(1,4))
                set(handles.edit_date,'String',a(1,5))
                set(handles.edit_text,'String',a(1,3))
                if isempty(a(1,6))
                    set(handles.edit_note,'String','There is no note');
                else
                    set(handles.edit_note,'String',a(1,6));
                end
            else
                set(handles.listbox1,'String','')
                set(handles.edit_id,'String','')
                set(handles.edit_name,'String','')
                set(handles.edit_text,'String','')
                set(handles.edit_location,'String','')
                set(handles.edit_date,'String','')
                set(handles.edit_note,'String','')
            end
            close(curs)
            close(conn)
            helpdlg('Delete if Done','Delete')
        case 'Cancle'
            return;
    end
end

• List of files :

Figure 3.16 : List of files.

% --- Executes on button press in btn_speak.
function btn_speak_Callback(hObject, eventdata, handles)
% hObject    handle to btn_speak (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
text = get(handles.edit_text,'String');
if ~isempty(text)
    value = get(handles.edit_text,'String');
    setappdata(0,'text',value)
    close()
    ocr_gui()
end

• Return to the main Form with the text :

Figure 3.17 : Returning to the main form with the text.

function ocr_gui_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to ocr_gui (see VARARGIN)
% Choose default command line output for ocr_gui
handles.output = hObject;
text = getappdata(0,'text');
if ~isempty(text)
    set(handles.text2,'String',text)
    set(handles.Speak,'Enable','on')
    rmappdata(0,'text');
end
% Update handles structure
guidata(hObject, handles);

CHAPTER FOUR
IMPLEMENTATION
4.1 Project Implementation :

1. Load an image of any format (bmp, jpg, png, etc.).

Figure 4.1 : Loading an image into the program.

2. The image will load.

Figure 4.2 : Viewing the image in the program.

3. View the image information by clicking the button called Image Info.

Figure 4.3 : Viewing the image information.

4. Convert the image to grayscale and binarize it using the threshold value (Otsu algorithm).

5. Lines detection and removing.

6. Page layout analysis. In this step we try to identify the text zones present in the image, so that only that portion is used for recognition and the rest of the region is left out.

7. Detection of text lines and words. Here we also need to take care of different font sizes and small spaces between words.

8. Recognition of characters. This is the main algorithm of OCR: an image of every character must be converted to the appropriate character code. Sometimes this algorithm produces several character codes for uncertain images. For instance, recognition of the image of the "I" character can produce the "I", "|", "1" and "l" codes, and the final character code will be selected later.

9. Saving results to the selected output format, for instance, searchable TXT file format.

10. Click Recognize Text to get the text.

Figure 4.4 : Recognizing text.

11. Clicking on Save to Notepad will open a form to insert the name and location of the file (Browse), and will store the name, text, location, path and note of the txt file in the database directly.

Figure 4.5 : Saving text in a notepad file.

12. Click OK to open and save in a file. If the file name already exists in the location you select, a message will show and ask you whether you want to override the file or cancel to rename it.

Figure 4.6 : Warning message of an existing file name.

Figure 4.7 : Opening a file in notepad.

13. Import text to be edited and read in the editor and to be converted into voice (text-to-speech conversion).

Figure 4.8 : The pattern classification process.
When you select the file, its contents will be loaded in the edit text.

Figure 4.9 : Loading the contents of the file into the edit text.

14. Use the database to view the recent documents that have been saved by this program.

Figure 4.10 : Viewing the recent document using the database.

15. Open the text you have saved in the database in Notepad.

Figure 4.11 : Opening the text of notepad file using database.

16. You can edit the note.

Figure 4.12 : Editing in the notepad file.

Figure 4.13 : Updating the editing.

17. You can click on Speak to load the text in the main form.

18. You can also delete from the list.

Figure 4.14 : Warning message of deleting file from list.

Figure 4.15 : Delete done message.

• Conclusion :

In this project, we discussed the topics relevant to the development of TTS systems. This paper describes the successful completion of a simple text to speech translation by simple matrix operations. Thus this system is very easy and efficient to implement, unlike other methods which involve many complex algorithms and methods. We conducted MOS tests to evaluate the performance of the speech synthesizer. The next step in improving this system would be implementing some machine learning algorithms in order to support generalization.

• Suggestions for Future Work :

A number of open problems must be solved to allow the development of a true image and text to speech conversion and recognition system. These problems suggest a variety of research directions that need to be pursued to make such a system feasible.

First, making the application able to open text in different text file formats (docx, pdf, etc.) with the help of database programs.
Second, saving the text files in different text file formats (docx, pdf, etc.) with the help of database programs.
Third, saving the audio files in different audio file formats (WAV, MP3, VOX, RAW, etc.).
Fourth, we will add another feature to our project, which is Speech to Text Conversion.
Fifth, opening an audio file and getting the speech to text conversion of this file.
Finally, we are interested in making our project more efficient, making it useful to different groups of people in the community, and spreading its features globally.
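As a closing illustration of step 4 of the implementation walkthrough (grayscale conversion followed by binarization with an Otsu threshold), the sketch below computes an Otsu threshold by hand on a tiny matrix. In MATLAB this step is typically done with the Image Processing Toolbox functions graythresh and im2bw; whether the project code calls them is not shown in this chapter, so the example avoids the toolbox and should be read as an explanation of the idea only. The image values and all variable names are assumptions made for the example.

```matlab
% Tiny illustrative grayscale image (values 0..255):
% dark "text" pixels on a light "paper" background.
img = [ 20  25 200 210;
        30 220 215  22;
       205 210  28 212];

% Histogram of the 256 gray levels and the level probabilities.
counts = accumarray(img(:) + 1, 1, [256 1]);
p      = counts / sum(counts);

% Cumulative class probability and cumulative mean for each threshold.
omega = cumsum(p);
mu    = cumsum(p .* (0:255)');
mu_t  = mu(end);                     % mean gray level of the whole image

% Otsu's criterion: between-class variance for every candidate threshold.
sigma_b = (mu_t * omega - mu).^2 ./ (omega .* (1 - omega));
sigma_b(~isfinite(sigma_b)) = 0;     % thresholds that leave one class empty

% The threshold is the gray level that maximizes the criterion.
[~, idx]  = max(sigma_b);
threshold = idx - 1;                 % levels are 0-based, indices 1-based

% Binarize: 1 for light background, 0 for dark text.
bw = img > threshold;
```

For this image any threshold between the darkest background pixel and the brightest text pixel separates the two groups equally well, and max simply returns the first such level.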