The development of face recognition technology and the role of adequate training data
Face recognition is a biometric technology that identifies people by their faces. It is often used in security settings, but it also has other uses, such as photo tagging on social media.
For an AI system to recognize a person by their face, it must first be trained on data that shows it what distinguishes one face from another. That training data needs to be accurate, and the dataset must be large enough to cover a wide variety of examples.
Development of Face Recognition Technology
In the late 1970s, David Marr presented a three-layered model to describe how the human brain processes visual information. In his account, the visual system first receives a raw image, represented at the lowest layer of the model, and then performs a series of operations that turn it into a representation of the world that can be understood. This representation is then passed on for higher-level processing.
In the 1980s, Fukushima and Miyake developed a model similar to Marr’s that added a fourth layer, describing how the brain combines the processed information from the first three layers into a perception of the world.
Both models describe how visual information is processed. They differ in emphasis: Marr’s model describes the process as a series of operations, while Fukushima and Miyake’s model describes it as a combination of the processed information from the earlier layers.
How Face Recognition Technology Builds on the Work of Marr, Fukushima, and Miyake
Face recognition technology draws on both Marr’s three-layered model and Fukushima and Miyake’s four-layered model of visual processing. A typical system can be described in four layers:
- The first layer, image acquisition, consists of the cameras that capture the image.
- The second layer, image pre-processing, consists of the algorithms that prepare the image, such as edge detection and face detection.
- The third layer, feature extraction, derives features from the image, such as the locations of the eyes, nose, mouth, and other facial landmarks.
- The fourth layer, face recognition, consists of the algorithms that compare the extracted features against previously trained data.
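The fourth layer can be illustrated with a small sketch. It assumes the third layer has already reduced each face to a numeric feature vector, and it uses cosine similarity with a fixed threshold as the comparison rule. The names `recognize` and `gallery`, the threshold of 0.8, and the three-dimensional vectors are all illustrative assumptions; production systems use much larger vectors and more elaborate matching.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(probe, gallery, threshold=0.8):
    """Return the best-matching enrolled name, or None if no match reaches the threshold."""
    best_name, best_score = None, threshold
    for name, enrolled in gallery.items():
        score = cosine_similarity(probe, enrolled)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical enrolled feature vectors (assumed, not real face data).
gallery = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

print(recognize([0.85, 0.15, 0.35], gallery))  # close to alice's vector
print(recognize([0.0, 0.0, 1.0], gallery))     # not close to anyone enrolled
```

Returning `None` for a face that resembles nobody in the gallery is a design choice: without a threshold, the function would always name the nearest enrolled person, even for a stranger.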
Why Quality Training Data Matters
Training data is crucial to the development of face recognition technology, and having enough of it is not sufficient on its own: the data must also be of high quality. If the training data is of low quality or contains many errors, the accuracy of the face recognition software suffers.
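Some common quality problems can be caught automatically before training. The following is a minimal sketch that assumes each image is described by a record with the hypothetical fields `sha256`, `label`, `width`, and `height`. The minimum size of 64 pixels is an arbitrary assumption, not an established standard.

```python
from collections import Counter

MIN_SIDE_PX = 64  # assumed minimum usable image size; tune for the actual model


def find_quality_issues(records):
    """Return (index, problem) pairs for records that look unusable for training."""
    issues = []
    # Identical file hashes indicate the same image was included more than once.
    hash_counts = Counter(r["sha256"] for r in records)
    for i, r in enumerate(records):
        if not r.get("label"):
            issues.append((i, "missing label"))
        if min(r["width"], r["height"]) < MIN_SIDE_PX:
            issues.append((i, "image too small"))
        if hash_counts[r["sha256"]] > 1:
            issues.append((i, "duplicate file"))
    return issues


# Hypothetical metadata records for four images.
records = [
    {"sha256": "a1", "label": "person_001", "width": 256, "height": 256},
    {"sha256": "b2", "label": "", "width": 256, "height": 256},
    {"sha256": "c3", "label": "person_002", "width": 32, "height": 48},
    {"sha256": "a1", "label": "person_003", "width": 256, "height": 256},
]

for index, problem in find_quality_issues(records):
    print(index, problem)
```

In this example the first and last records share a hash but carry different labels, so at least one of the two labels is wrong. Checks like these do not replace human review, but they narrow down where to look.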
Training Data Requirements
The requirements for training data depend on the type of learning algorithm the face recognition technology uses. There are three types:
- Supervised learning algorithms learn from labeled data and use it to predict an output. They are usually trained on a set of face images that have already been classified.
- Unsupervised learning algorithms are trained on unlabeled data and are often used to discover patterns or clusters within a dataset.
- Semi-supervised learning algorithms are trained on a combination of labeled and unlabeled data.
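The difference between the three approaches can be sketched on toy data. The example below assumes faces have already been converted to two-dimensional feature vectors, and it substitutes deliberately simple techniques: a nearest-centroid classifier for supervised learning, a distance-threshold grouping for unsupervised learning, and pseudo-labeling for semi-supervised learning. All function names and values are illustrative assumptions, not the method of any particular face recognition product.

```python
import math


def centroid(points):
    """Mean of a list of equal-length vectors."""
    return [sum(dim) / len(points) for dim in zip(*points)]


def nearest(point, centroids):
    """Key of the centroid closest to `point` (Euclidean distance)."""
    return min(centroids, key=lambda k: math.dist(point, centroids[k]))


def fit_supervised(labeled):
    """Supervised: build one centroid per identity from (vector, label) pairs."""
    by_label = {}
    for vec, label in labeled:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def cluster_unsupervised(vectors, radius=0.5):
    """Unsupervised: group vectors lying within `radius` of a cluster's first member."""
    clusters = []
    for vec in vectors:
        for members in clusters:
            if math.dist(vec, members[0]) <= radius:
                members.append(vec)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([vec])
    return clusters


def fit_semi_supervised(labeled, unlabeled):
    """Semi-supervised: pseudo-label the unlabeled vectors, then refit on everything."""
    model = fit_supervised(labeled)
    pseudo = [(vec, nearest(vec, model)) for vec in unlabeled]
    return fit_supervised(labeled + pseudo)


# Hypothetical feature vectors: two labeled faces and two unlabeled ones.
labeled = [([0.0, 0.0], "alice"), ([1.0, 1.0], "bob")]
unlabeled = [[0.2, 0.0], [0.8, 1.0]]

print(fit_supervised(labeled))
print(cluster_unsupervised([vec for vec, _ in labeled] + unlabeled))
print(fit_semi_supervised(labeled, unlabeled))
```

Note how a wrong label in `labeled` would shift a centroid directly, whereas the clustering step never looks at labels at all. That difference is why the approaches place different demands on the data.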
The training data for each of these three types must meet different requirements.
- Training data for supervised learning algorithms should be of high quality, with as few labeling errors as possible. The more accurate the training data is, the more accurate the face recognition software will be.
- Training data for unsupervised learning algorithms should also be of high quality, but quantity matters more than label accuracy. Because these algorithms discover patterns or clusters within a dataset, more data generally leads to better results.
- Training data for semi-supervised learning algorithms requires a combination of high quality and quantity. The labeled portion must be accurate enough to produce reliable results, and the dataset as a whole must be large enough to provide a wide variety of examples.
The training data must also be relevant to the face recognition software being developed. For example, training data for face recognition software used in a security setting should be of high quality and include people of many different races, ages, and genders. Likewise, training data for face recognition software used on social media should include people of many different ages and genders, and should not be limited to people of one particular race or ethnicity.
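One way to check whether a dataset covers the groups the software will encounter is to count how often each group appears in the image metadata. The sketch below assumes every record carries a self-reported attribute such as `age_group`. The function name `underrepresented`, the field name, and the 20% cutoff are illustrative assumptions; an appropriate cutoff depends on the intended user population.

```python
from collections import Counter


def underrepresented(records, attribute, min_share=0.1):
    """Return {group: share} for groups whose share of the dataset is below `min_share`."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items() if n / total < min_share}


# Hypothetical metadata: ten images with a skewed age distribution.
records = (
    [{"age_group": "18-30"}] * 6
    + [{"age_group": "31-50"}] * 3
    + [{"age_group": "51+"}] * 1
)

print(underrepresented(records, "age_group", min_share=0.2))
```

A group flagged by this check is a candidate for collecting additional images before training. Groups that are missing from the dataset entirely will not appear in the counts at all, so the list of expected groups should be reviewed separately.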