extract text from image

It's not random, though you don't know the algorithm it uses (it's top left to bottom right, column by column).

You will need to find each line and extract it before you label. See attached demo, which will produce this:
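The attached demo is not reproduced here, so the following is only a minimal sketch of one common way to find each text line: mark every row that contains ink, then cut the image wherever that row profile starts and stops. The synthetic image and the variable names (rowHasInk, topRows, bottomRows) are assumptions for illustration, not taken from the demo.

```matlab
% Synthetic binary "page": true = ink. Three text lines separated by blank rows.
bw = false(60, 100);
bw(5:14, 10:90)  = true;   % text line 1
bw(25:34, 10:70) = true;   % text line 2
bw(45:54, 10:80) = true;   % text line 3

% Horizontal profile: a row belongs to a text line if it contains any ink.
rowHasInk = any(bw, 2)';

% Rising and falling edges of the padded profile give the first and
% last row of each text line.
d = diff([0, rowHasInk, 0]);
topRows = find(d == 1);
bottomRows = find(d == -1) - 1;

for k = 1 : numel(topRows)
    thisLine = bw(topRows(k):bottomRows(k), :);   % crop one text line
    fprintf('Line %d: rows %d to %d (%d pixels of ink)\n', ...
        k, topRows(k), bottomRows(k), nnz(thisLine));
    % Labeling thisLine on its own (e.g. with bwlabel) keeps the labels
    % in left-to-right order within that line.
end
```

On a real scan the profile usually needs a small tolerance (for example, requiring more than a few ink pixels per row) before the lines separate cleanly.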

There is no OCR package built into MATLAB. You'll have to find one or do it yourself.
20 Comments
Image Analyst on 22 Dec 2013

By the way, you should probably start officially Accepting some of your old questions so we know you're done with them. You haven't accepted a single question yet.

Waheed Ullah on 2 Jan 2016

It's good code for text extraction, but respected sir, I want to extract faces from an image, so please share that code. Thanks.

Image Analyst on 2 Jan 2016

malik nouman on 17 Apr 2016

Edited: malik nouman on 17 Apr 2016

Respected sir, I want to extract text from natural images. Will this code work? Please suggest any technique for text extraction.

Image Analyst on 17 Apr 2016

Srika on 22 Apr 2016

Respected sir, I need to save each segmented text line in a separate folder. When I tried it with my code, I got this error:

Error using horzcat
Dimensions of matrices being concatenated are not consistent.
Error in textext (line 70)
imwrite(thisLine,['Datasets/',num2str(thisLine),'.jpg']);

Image Analyst on 22 Apr 2016

whos thisLine
fprintf('thisLine = %s\n', thisLine);
fileName = sprintf('Datasets/%s.jpg', thisLine)
whos fileName
fprintf('fileName = %s\n', fileName);
imwrite(thisLine, fileName);

What do you see in the command window?

Srika on 3 May 2016

Sir, can I get any other images for this code?

Image Analyst on 3 May 2016

Try a web search for images that have text in them.

Srika on 4 May 2016

I searched for images, but this code does not segment the text in all of them.

SATISH KUMAR on 20 Jan 2017

Edited: SATISH KUMAR on 20 Jan 2017

This code is not working with my 1024x1024 block of a document image. This is my document image. Are there any changes to be made to the above code so that I can extract the words from my document image? The characters are being extracted, but I need the words and text lines too.

Image Analyst on 20 Jan 2017

OK Satish - fess up. You didn't even try to adjust the thresholds for your image, did you? When you do that, it works fine. Adjusted code is attached.

SATISH KUMAR on 23 Feb 2017

Thank you, sir.

Walter Roberson on 12 Mar 2017

oza san comments to Image Analyst:

Dear sir, here I am again to ask you some questions. Using the code in the link, how can I make it work for multilingual and multi-font text? Can I use it as it is? And can I also use it for artificial text? I think I can, but I have a little doubt.

Image Analyst on 12 Mar 2017

If it doesn't work for your language, then you'll just have to tweak it. There's no guarantee that it works right out of the box for every language.

oza san on 12 Mar 2017

It works fine, but it looks like it is size dependent. I have to adjust it every time I give it a new image. Am I right, or is there another way to make it size independent?
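One possible way to reduce the per-image tuning, sketched here under assumptions since the attached code and its actual thresholds are not shown in this thread: let Otsu's method (graythresh) choose the intensity threshold, and express the minimum blob size as a fraction of the image area rather than a fixed pixel count. The name minAreaFraction and its value are hypothetical and would still need checking against real images.

```matlab
% Requires the Image Processing Toolbox (imbinarize needs R2016a or later).
grayImage = uint8(255 * ones(200, 300));   % synthetic white page
grayImage(50:90, 40:80) = 0;               % dark "character"
grayImage(120, 200) = 0;                   % 1-pixel noise speck

% Otsu's method picks the intensity threshold from the histogram,
% so it adapts to each image's brightness.
level = graythresh(grayImage);
bw = ~imbinarize(grayImage, level);        % text is dark, so invert

% Scale the minimum blob size with the image instead of hard-coding it.
minAreaFraction = 0.0005;                  % assumed value, tune as needed
minArea = max(1, round(minAreaFraction * numel(bw)));
bw = bwareaopen(bw, minArea);              % remove blobs smaller than minArea

fprintf('Minimum blob area for this image: %d pixels\n', minArea);
```

This only addresses thresholds that scale with brightness or resolution. Differences in font or script may still need their own adjustments.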

Walter Roberson on 12 Mar 2017

oza san comments to me:

Dear Walter R., you posted my question here, I think because it didn't make sense. Here is what I want to ask: I want to detect, segment, and identify scripts from video images using the code in the link http://www.mathworks.com/help/vision/examples/automatically-detect-and-recognize-text-in-natural-images.html?prodcode=VP&language=en. Can it handle what I stated in the original question? Please help me if you can.

Walter Roberson on 12 Mar 2017

oza san: you should only "flag" something if you need to bring it to the attention of the moderators, such as if it is spam or contains inappropriate language. Otherwise you should just add a comment. I moved your remarks out of "flags" into comments.

Walter Roberson on 12 Mar 2017

Can it handle what you asked before? No.

oza san on 12 Mar 2017

OK. Sorry for the mistake, and thank you for your constructive comments. I tested it with some images containing multilingual text, and it works, with some errors. The problem I face is that when the fonts differ, the results are very poor. Is there any way to handle that?

More Answers (3)

deeksha h r on 13 Aug 2016

Respected sir, the code displays segmented letters, but we need segmented lines to be displayed, i.e. each text line from the image should be displayed. Can you please help us with this code?

0 Comments
Arvinder singh on 1 Nov 2017

0 Comments
Sharad Sirsat on 23 Nov 2019

You don't need to use "imagen = bwareaopen(imagen,30);" here. After converting the image to binary, use bwlabel to count the number of characters/objects in the image. Then find the centroid of each character and use regionprops to get its length and width, which give a bounding box around the character. Once you have the bounding boxes, you can crop each character individually from the original image.

Note: For the centroid, bounding box (regionprops), and image crop, run a for loop up to the counted number of objects/characters.
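The steps described in this answer could look roughly like the sketch below. It assumes a synthetic binary image and illustrative variable names, and it deliberately skips bwareaopen as the answer suggests, so the single noise pixel is counted as a blob of its own.

```matlab
% Requires the Image Processing Toolbox.
% Synthetic binary image: two "characters" and one noise speck.
bw = false(40, 60);
bw(10:30, 5:20)  = true;   % character 1
bw(10:30, 35:50) = true;   % character 2
bw(3, 58)        = true;   % 1-pixel noise speck

[labeled, numBlobs] = bwlabel(bw);   % count connected objects
props = regionprops(labeled, 'BoundingBox', 'Centroid');

for k = 1 : numBlobs
    % BoundingBox is [xLeft, yTop, width, height].
    thisChar = imcrop(bw, props(k).BoundingBox);
    fprintf('Blob %d: centroid (%.1f, %.1f), cropped size %d x %d\n', ...
        k, props(k).Centroid(1), props(k).Centroid(2), ...
        size(thisChar, 1), size(thisChar, 2));
end
```

Here bw stands in for the original image. To crop from a grayscale or color original, pass that image to imcrop with the same bounding boxes.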

1 Comment
Image Analyst on 23 Nov 2019

But if you don't use bwareaopen(), you'll get a lot of small noise specks that are not letters. Who wants to deal with those?

Plus, if you don't do it like I did, line-by-line, then you'll have labels that are randomly (almost) chosen from each line. To avoid that you'd have to go through some special code, like get the y coordinates of each blob, and use something like kmeans() to find out which blobs are in the same line of text.
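A minimal sketch of that kmeans() idea, with several assumptions: the image is synthetic, the number of text lines (numLines) is known in advance, and the lines are well separated vertically. It is meant as an illustration of the grouping step, not as the method used in the attached demo.

```matlab
% Requires the Image Processing Toolbox (bwlabel, regionprops) and the
% Statistics and Machine Learning Toolbox (kmeans).
bw = false(60, 80);
for x = [5, 30, 55]               % three "characters" per text line
    bw(5:20, x:x+15)  = true;     % text line 1
    bw(35:50, x:x+15) = true;     % text line 2
end

% bwlabel scans column by column, so the labels alternate between lines.
[labeled, numBlobs] = bwlabel(bw);
props = regionprops(labeled, 'Centroid');
xy = vertcat(props.Centroid);     % numBlobs-by-2 matrix of [x, y]

% Cluster the blobs on their y coordinate only.
numLines = 2;                     % assumed known for this sketch
lineIdx = kmeans(xy(:, 2), numLines, 'Replicates', 5);

% Cluster numbers are arbitrary, so rank the clusters by mean y (top first).
meanY = accumarray(lineIdx, xy(:, 2), [numLines, 1], @mean);
[~, order] = sort(meanY);
lineRank = zeros(numLines, 1);
lineRank(order) = 1 : numLines;
lineNumber = lineRank(lineIdx);

% Reading order: by text line, then left to right within each line.
[~, readingOrder] = sortrows([lineNumber, xy(:, 1)]);
disp(readingOrder');              % blob labels in reading order
```

When the number of lines is not known, or the page is skewed, this grouping gets harder, which is one reason to extract line by line first.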