The purpose of this paper is to develop an intelligent system for text detection and character recognition in photographs and video of complex graphic scenes. The system consists of two main parts, each built on its own convolutional neural network. Recognizing text in an image is divided into three subproblems: locating text fields in the image, segmenting characters within the text fields, and recognizing the characters. Text field location follows a two-stage scheme. In the first stage, gradient methods analyze intensity changes in local areas of the color (RGB) image and select areas that may contain text. In the second stage, a classifier built on a convolutional neural network with a multi-scale image representation based on the discrete wavelet transform refines the first-stage result and estimates, for each pixel of the candidate text fields, the probability that it belongs to text. After training, the text/non-text classification accuracy was 99.3% on the training sample and 77.7% on the control sample. Character segmentation is performed in three stages (line extraction, word segmentation, character segmentation); morphological operations then improve the quality of the segmented character images (e.g. by removing noise and other objects that are not associated with a symbol or have no borders) and refine the character borders. The character recognition algorithm is also built on a convolutional neural network, trained with the error back-propagation algorithm. After training, the character recognition accuracy was 96.88% on the training sample and 93% on the control sample. Experimental verification confirmed that the proposed solutions can detect text and recognize characters in images of complex graphic scenes that contain many non-textual objects (e.g. people, fragments of a house, …).
The quality of intelligent text recognition systems can be further improved by linguistic correction of the recognized text.
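The three segmentation stages (lines, words, characters) can be illustrated with a minimal sketch. It assumes projection profiles on an already binarized text field, a common technique that the abstract does not name, so it should not be read as the method actually used in the paper. The names `runs`, `segment`, `word_gap` and `demo` are hypothetical, and the threshold that separates word gaps from character gaps is an arbitrary illustrative value.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed input)


def runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges of non-zero runs in a profile.

    Runs separated by fewer than `min_gap` zero entries are merged.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ends at the first blank
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: same unit
        else:
            merged.append((s, e))
    return merged


def segment(image: Image, word_gap: int = 3):
    """Split a binary image into lines, then words, then character boxes.

    Returns a list of lines; each line is a list of words; each word is a
    list of half-open boxes (top, bottom, left, right).
    """
    row_profile = [sum(row) for row in image]           # ink per row
    lines = []
    for top, bottom in runs(row_profile):                # stage 1: text lines
        band = image[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]   # ink per column
        chars = runs(col_profile)                        # stage 3: characters
        words = runs(col_profile, min_gap=word_gap)      # stage 2: words
        lines.append([
            [(top, bottom, l, r) for l, r in chars if w_l <= l and r <= w_r]
            for w_l, w_r in words
        ])
    return lines


# Tiny synthetic example: one line with two words, and a second short line.
demo = [[int(c) for c in row] for row in (
    "110110001100",
    "110110001100",
    "000000000000",
    "011000000000",
)]
boxes = segment(demo)
```

In this sketch a blank row separates lines, and a blank column run of at least `word_gap` pixels separates words. Real scenes would need the morphological clean-up described above before such profiles become reliable.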