August 2019
Envision AI enables visually impaired people to read text, either in real time or by scanning documents, including handwritten text. It also offers color identification, scene description, and product identification by barcode scanning.
The application is available for Android, which requires version 5.0 or later, and for iOS, which requires version 10.0 or later.
The following sections analyze the interface and the functions of the application on both Android and iOS.
Interface
The interfaces of the Android and iOS versions differ slightly in the number of tabs, their location and their content.
The first difference is that Android has 3 tabs located at the top of the screen, while iOS has 4 tabs located at the bottom. The tab missing from the Android version is called Scan and find.
Another important difference is that on Android you can turn on the camera flash to improve lighting. The iOS version has no button for this option.
The other significant differences lie in which buttons each version offers to activate the application's functions, and in which tab they appear. The first button that appears on Android but not on iOS is the Read Handwritten Text button in the Text tab. Also in the Text tab, iOS has a button called More Actions, which lets you recognize text from PDFs, images or multiple pages. The Android version lacks this button.
A very important button for visually impaired users is Color Detect. However, this function is only present on iOS, in the General tab.
The Scan Bar Code and Teach Envision buttons are present on both Android and iOS, although in different tabs: on Android they are in the General tab, while on iOS they are in the Scan and Find tab.
Finally, the Find People and Find objects options are present on iOS in the Scan and Find tab, but not on Android.
Features
Below is a detailed analysis of the functions that Envision AI can perform in both the Android and iOS versions.
Magnification and text functions
Magnifier
The Magnifier feature lets you use your device's camera as a magnifying glass. To activate it, press the button called “Magnifier”. A slider then appears for setting the magnification level from 0 to 100. To close the magnifier, press the button again.
Start reading instantly
To activate this option, you must go to the Text tab and press the button that says "Start reading instantly", located at the bottom left of the screen. Once the button is pressed, Envision immediately begins reading any text the camera catches. To stop instant reading you have to press the button again, which now says “Stop reading instantly”.
Read handwritten text
To activate this option, only available on Android, you have to go to the Text tab and press the button that says "Read handwritten text", located at the bottom center of the screen. After pressing it, the application begins to read the text when it is detected.
Read documents
This option lets the user photograph a text for recognition, or lets the application take the photo automatically when it detects a document.
Once the text has been recognized, the application displays it on a new screen. On iOS, the bottom of the screen has buttons to start and pause automatic reading of the text, to change the text size and to export the text. Android has similar buttons, with two differences: there is no button to change the text size, and there are two additional buttons, “Next page” and “Previous page”.
More Actions
This option, located within the Text tab, is only available in the iOS version, and lets you read multi-page texts, PDFs, and images. The button to activate it is located at the bottom right of the screen; pressing it displays a submenu with the different options. "Import PDF" opens a browser to search for files on the phone. "Import image" opens the device's photo library. "Multiple pages" opens the camera and displays an indicator showing the number of pages captured, along with two more buttons, one to take a picture and one to stop taking pictures.
General functions
Describe the scene
To activate this option, located within the General tab, you must press the button with that name, located at the bottom left of the screen. After activating the function, the user has to take a picture of their surroundings and the application will describe the scene shown in it.
Detect Color
This option is located within the General tab and is available only on iOS. It lets the user identify the colors of objects with the camera. Once the corresponding button, located at the bottom center of the screen, is pressed, the application begins to speak the colors it detects. Detection can use a standard precision of 30 colors or a more descriptive precision of 950 colors.
Teach Envision
This option lets the application learn to recognize the faces of friends or relatives, either from photos taken of the person or from images saved in the phone's photo library. It is located in the General tab on Android and in the Scan and find tab on iOS. To use it, press the button that says “Teach Envision”, located at the bottom right of the screen. If the “Show a face” option is chosen, the camera opens and you can take up to five photos of the person for Envision to learn. If the "Open library" option is chosen, the phone's photo library is displayed so the user can select a photo. Once the face has been learned and a name assigned, the "Describe the scene" option will name any stored person or people that Envision recognizes in the scene.
Scan barcode
This option is located in the General tab and allows the user to identify a product by scanning its barcode. To use it, press the button located at the bottom right on iOS and at the bottom center on Android. The camera then scans the barcode and, if the product is identified, its name is displayed along with a button that provides more information about it.
Scan and Find Features
Find people
This option is only available on iOS and is located in the Scan and Find tab. To activate it, press the “Find people” button located at the bottom left of the screen. The user can then move the phone around, and when Envision detects a person, it beeps to indicate this.
Find objects
This feature is only available on iOS and is located in the Scan and Find tab. To activate it, press the “Find objects” button, located at the bottom center of the screen. A list of objects is then displayed, such as car, cat, toothbrush, bag or fork. The user selects one of the options and then moves the phone around. When Envision detects an object of the selected type, the application plays a sound to indicate this.
Help tab
Configuration
Envision AI lets you configure different parameters, which depend on the version used. More parameters can be configured on iOS than on Android because the iOS version has more functionality.
The available configuration parameters are detailed below. All of them are located in the tab called Help.
Offline text recognition
This option can be enabled or disabled. When enabled, it provides faster text recognition for texts written in Latin-based languages.
Automatic language detection
This option can be enabled or disabled. When disabled, the application only reads text in the system language.
Text to speech
This option is called “Text to Speech” on Android and “Speech” on iOS, and it lets you configure the application's own voice, which is separate from the screen reader.
On Android it lets you choose which of the synthesizers installed on the system to use.
On iOS it lets you adjust the speed of the voice (which is not the VoiceOver voice), as well as choose the language and the "person" who speaks.
Color detection
This setting is only available on iOS. It lets you select the accuracy of color identification when using the Color Detect functionality. You can choose between standard (30 colors) or descriptive (950 colors).
Help
Within the help, you can access a series of online tutorials in English through the "Read tutorials" option, send feedback through the "Give opinion" option, or ask the developer to contact you by phone through the “Request a call” option.
Other options on the help tab
Other possible options located within the Help tab include checking the account details, signing up, sharing the app with friends, writing an Envision review and checking the version data, privacy policy and terms of use of the application.
Devices used
The tests were carried out on an iPhone SE with iOS 12.4 and a Huawei Mate P20 Lite with Android 9.0. A Samsung Galaxy J3 with Android 8.0 was also used.
It should be noted that the application does not work correctly on Android when Silent Mode is activated: the speech synthesis in functions such as Read instantly does not work, although the screen reader works perfectly.
Tests performed
To analyze and evaluate the application, a series of tests of the different options was carried out on both the Android and iOS versions.
The text recognition tests, across the different options available, were very positive in general. In both instant text reading and document recognition, the application gave very satisfactory results and clearly identified the texts presented. The only difficulty arose with texts containing multiple boxes and images, where the results did not allow a clear reading of the text. Handwriting recognition was equally satisfactory, and these texts were read without problems. The application had more difficulty recognizing PDFs and images on iOS, which may have been affected by embedded images as well as by the structure and quality of the documents.
A very noteworthy point when reading documents is that the application indicates whether or not the edges are visible. This function is designed for scanning documents such as letters or sheets of paper, and these indications are very useful for people who are completely blind or whose reduced vision does not allow them to see even the document. In the tests carried out, the application always identified the edges and correctly indicated when they were within the camera frame and when they were not.
The "Describe the scene" option behaved reasonably in the tests, although with some failures. These occurred because some objects were not correctly identified, although in those cases the application indicated that the description might not be exact. It also used an incorrect word on one occasion, but this was a mistranslation of the description. It should also be noted that in the tests on Android the scene was initially described in the phone's default language, but after updating to version 0.9.6 the scenes were described in English.
Color identification was very precise during the tests, and offered a wide range of color nuances when the descriptive mode was activated in the settings.
The tests of the Find people functionality were very positive. Its purpose is not to describe or identify the person or people, but to tell the user whether a person is present, which is quite useful for people who cannot see this for themselves. In all cases the application worked correctly, emitting a sound to indicate a person.
The Find objects function correctly identified most objects, from handbags to armchairs to laptops. It was only unable to identify objects in the telephone category, whether mobile or landline.
Teach Envision gave completely satisfactory results. During the tests, the application perfectly recognized the man and the woman whose faces had been entered. When scenes were captured, the man and the woman were easily identified, whether they appeared separately or together.
The negative aspects of the application, as with all applications that make intensive use of the camera and the Internet connection, are its high data and battery consumption. Data consumption is high because most features send data over the Internet for processing on an external server. Since what is sent is images, presumably at high resolution to allow a good analysis, data usage is considerable. As for the battery, the camera is constantly working, so the energy consumption of the device is very high and the application quickly drains the battery.
Finally, the magnifier function was tested and works perfectly, although its magnification is likely to be better or worse depending on the quality of the device's camera.
Conclusion
As the tests show, Envision AI performed very well, with very notable results. The application is designed primarily for people who are totally blind or have very low vision, as functions such as "Find people" and "Find objects" demonstrate. This does not make it any less useful for other people with low vision, since the text recognition and color identification functions, as well as the magnifier, are very useful for this group.
A very noteworthy aspect is that, when recognizing a document, the application indicates whether the edges are within the camera frame, which is very useful for totally blind people.
The possibility of choosing between standard and descriptive mode when identifying colors is also worth noting. Users who are satisfied with a basic range of colors can activate the standard mode, while more demanding users can activate the descriptive mode for greater precision.
The voice configuration options are also very interesting. Although the voice appears to be more configurable on iOS, where you can select the speed, the language and the "person", on Android the selected synthesizer can be configured from the Android settings themselves.
The less positive aspects of the application are its high data consumption, since most of its functions connect to the Internet to process images on an external server, and the high battery consumption caused by the camera being permanently active.
In general, it can be concluded that this is a great application, of real interest to blind and low vision people, which performs its functions with great accuracy.
Highlights
- Handwritten text recognition.
- Recognition of PDFs and images on iOS.
- High precision in text recognition.
- Possibility of recognizing 950 colors.
- Ability to obtain additional product information by scanning the barcode.
- Possibility of learning the faces of friends and family.
- Possibility of configuring the voice that is not the screen reader.
Improvement points
- The Detect Color, Find People and Find Objects functions could be included in the Android version.
- A button to activate the flash could be considered for the iOS version.
- Scene descriptions in the phone's default language could be considered for the Android version.
- Future versions could look at how to reduce the device's battery consumption.
- Future versions could also look at how to reduce data consumption.