Extracting general discriminative features for age estimation while reducing the negative influence of individual differences is still an open problem. To address it, we developed a new method called locally adjusted robust regression (LARR). The idea of LARR is illustrated in Figure 1. Suppose the age value predicted by the Support Vector Regression (SVR) method for the input data y is f(y). The point f(y) is displayed as the black dot on the regression curve. The estimated age, f(y), may be far away from the true age value, L, shown as the red dot on the true age trajectory curve. The idea of the LARR method is to slide the estimated value, f(y), up and down (corresponding to greater and smaller age values) by checking different age values, t ∈ [f(y) - d, f(y) + d], to see whether a better age estimate can be found. The value d indicates the range of ages for the local search. Ideally, the true age value, L, also lies within this range, i.e., L ∈ [f(y) - d, f(y) + d].
Figure 1. Illustration of the idea of locally adjusted robust regression.
The LARR method is therefore a two-step procedure: (1) a robust regression over all ages of the training data using the SVR method, which can be considered a global regression process; and (2) a local adjustment within a limited range of ages centered at the regression result.
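The two-step procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not our implementation: `global_regressor` stands in for the trained SVR, and `local_score` is a hypothetical function that rates how well a candidate age t fits the input y. The text above does not specify how candidates are compared, so any scorer can be plugged in. The default age limits (0 to 93) are likewise only an assumption.

```python
from typing import Callable, Sequence


def locally_adjusted_estimate(
    y: Sequence[float],
    global_regressor: Callable[[Sequence[float]], float],
    local_score: Callable[[Sequence[float], int], float],
    d: int,
    min_age: int = 0,   # assumed lower age limit
    max_age: int = 93,  # assumed upper age limit
) -> int:
    """Global regression followed by a local search of +/- d years."""
    # Step 1: global regression over all ages (SVR in the text).
    f_y = global_regressor(y)

    # Round to a whole year and keep the center inside the valid age range.
    center = min(max(int(round(f_y)), min_age), max_age)

    # Step 2: candidate ages t in [f(y) - d, f(y) + d], clipped to valid ages.
    lo = max(min_age, center - d)
    hi = min(max_age, center + d)

    # Keep the candidate that the (hypothetical) local scorer rates highest.
    return max(range(lo, hi + 1), key=lambda t: local_score(y, t))
```

With a wide enough search range the local step can correct a poor global estimate; with a narrow range it can only move the estimate to the edge of the window, which is why the choice of d matters.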
Second, we developed a representation using biologically-inspired features (BIF) for encoding human ages from facial images. Given the superior performance of human vision on general object recognition and age estimation, it is reasonable to look to biology for inspiration to improve the computer’s performance for age estimation. This motivates us to explore features and methods from brain modeling and studies.
We proposed a new operator called “STD” that reveals local variations that may be important for characterizing the subtlety of aging on faces (e.g., wrinkles, creases, and eyelid bags). This can be observed in Figure 2, where the input face image is of size 60 x 60. The S1 (simple cell) units at four orientations of band 4 (filter sizes of 17 x 17 and 19 x 19) are displayed, and a pooling grid of 12 x 12 is drawn in each S1 map. The local variation within the grid is significant (especially at the orientation of 45 degrees), yet a pure “MAX” operation, which was used in previous models, cannot reveal it. Experimentally, we found that the “STD” operation outperforms pure “MAX” pooling for age estimation.
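The difference between the two pooling operations can be illustrated with a small sketch. It assumes non-overlapping square pooling cells and a population standard deviation, neither of which is stated above; the function name `pool_blocks` is ours for illustration only.

```python
from statistics import pstdev


def pool_blocks(s1_map, cell=12):
    """Return (max_map, std_map) pooled over non-overlapping cell x cell blocks.

    s1_map is a 2-D list of S1 responses. Rows or columns that do not fill a
    whole block are ignored (an assumption of this sketch).
    """
    rows, cols = len(s1_map), len(s1_map[0])
    max_map, std_map = [], []
    for r in range(0, rows - cell + 1, cell):
        max_row, std_row = [], []
        for c in range(0, cols - cell + 1, cell):
            # Flatten the responses that fall inside this pooling cell.
            block = [s1_map[i][j]
                     for i in range(r, r + cell)
                     for j in range(c, c + cell)]
            max_row.append(max(block))      # "MAX": strongest response only
            std_row.append(pstdev(block))   # "STD": spread of the responses
        max_map.append(max_row)
        std_map.append(std_row)
    return max_map, std_map
```

For a 60 x 60 map and a 12 x 12 cell, this sketch yields a 5 x 5 grid per operation. A flat region and a textured region with the same peak response give identical “MAX” values but different “STD” values, which is the local variation the “STD” operator is meant to expose.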
We performed extensive age estimation experiments on two databases. The first is a large database with 8,000 face images captured from 1,600 Asian subjects, 800 females and 800 males, in the age range from 0 to 93 years. The second is the FG-NET Aging Database, which is publicly available. It contains 1,002 face images of 82 subjects, with ages ranging from 0 to 69 years.
Figure 2. The S1 layer of band 4 (two scales with filter sizes of 17 x 17 and 19 x 19) at four orientations. The pooling grid size for C1 (complex cell) units is a 12 x 12 square shown in each S1 map.
The Cumulative Score (CS) curves obtained from our methods in comparison with others are shown in Figure 3. Further details about the methods can be found in [i,ii,iii,iv].
Figure 3. Our results (F1, F2, F3, and F2.F) are much better than the best published ones.
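For readers unfamiliar with the measure, the Cumulative Score at error level j is commonly defined in the age estimation literature as the percentage of test images whose absolute age error is no greater than j years. The sketch below computes it under that definition; the function name and the default maximum error level of 10 years are assumptions made for illustration.

```python
def cumulative_score(true_ages, predicted_ages, max_error_level=10):
    """Return [CS(0), CS(1), ..., CS(max_error_level)] as percentages."""
    if len(true_ages) != len(predicted_ages) or len(true_ages) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    # Absolute error in years for every test image.
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    n = len(errors)

    # CS(j) = share of test images with absolute error <= j, in percent.
    return [100.0 * sum(e <= j for e in errors) / n
            for j in range(max_error_level + 1)]
```

A curve that rises faster at low error levels indicates that more test images are estimated within a few years of the true age.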
Figure 4. Age estimation on Einstein’s faces using our method. The estimated ages are reasonable.