Tuesday, March 30, 2021

Using the MIT App Inventor FaceExtension (for Facemesh)

 【abstract】The MIT App Inventor FaceExtension has been released [1][2]. It makes it very easy to create a smartphone app that uses Facemesh, Google's facial-landmark model that detects about 450 key points on the face. Let's build three simple apps right away.

[Exercise-1] On/Off by opening and closing the mouth, taking face orientation into account
 You can see what I want to make in the animated GIF below. Let's build this little app using the FaceExtension of MIT App Inventor.

 My answer is shown in Figure 1. It is a complete App Inventor program that works as-is, and I think it is simple and very easy to understand. Note that the opening and closing of the mouth is detected from the difference in y-coordinate between the upper and lower lips, and the orientation of the face from the difference in x-coordinate between two points on the nose. These checks are grouped into one block (when FaceExtension.FaceUpdated), which fires in real time whenever the face or mouth moves.
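The logic in that block can be sketched in Python as follows. This is only an illustration of the idea, not the FaceExtension API: the landmark arguments and threshold values are hypothetical, and real landmark names and coordinate scales will differ.

```python
# Sketch of the Exercise-1 detection logic. Each landmark is a hypothetical
# (x, y) tuple in pixel coordinates; names and thresholds are illustrative
# and do not match the FaceExtension property names.

MOUTH_OPEN_THRESHOLD = 15   # pixels; tune for your camera resolution
FACE_TURN_THRESHOLD = 10    # pixels

def face_updated(upper_lip, lower_lip, nose_top, nose_bottom):
    """Called whenever new landmarks arrive (like FaceExtension.FaceUpdated)."""
    # Mouth open/close: vertical gap between the two lips.
    mouth_open = (lower_lip[1] - upper_lip[1]) > MOUTH_OPEN_THRESHOLD
    # Face orientation: horizontal offset between two points on the nose.
    dx = nose_bottom[0] - nose_top[0]
    if dx > FACE_TURN_THRESHOLD:
        orientation = "right"
    elif dx < -FACE_TURN_THRESHOLD:
        orientation = "left"
    else:
        orientation = "front"
    return mouth_open, orientation

print(face_updated((100, 200), (100, 230), (100, 150), (102, 190)))
# → (True, 'front'): the lips are 30 px apart, the nose offset is only 2 px
```

In App Inventor the same comparisons sit inside the FaceUpdated event block, so no polling loop is needed.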


    As a simple extension, try putting a mask on the face and taking it off, as follows.

[Exercise-2] Identify "Yes", "No", and "N/A" by face movement
  In [Exercise-1] above, we detected a specific instantaneous "state" of the face. Sometimes, however, you want to capture a "continuous movement" of the face instead: a nod, a head shake, or a head tilt. To detect these, you need to know the direction in which the face sways (up/down, left/right, diagonal) and the amplitude of that sway. This, too, can be done in the App Inventor (with the Facemesh extension) world. The GIF below shows an example answer. The three amplitudes in the figure are normalized, and the movement that gives the maximum value is adopted.

Yes (nod) -> fluctuation of the top of the nose up and down (y-axis direction)
No (shake) -> fluctuation of the top of the nose right and left (x-axis direction)
N/A (tilt) -> variation of the angle between the x-axis and the line segment connecting two points on the forehead



[Exercise-3] Visualize the transition of the mouth area during utterance
 One of the great features of MIT App Inventor is its easy integration with the GUI. Fig.4 shows an app that makes the best use of it. Facemesh can detect about 450 points on the face, but here we use only 4 points around the mouth; the simpler, the better. The app graphs the transition of the area of this diamond-shaped mouth region (blue line) and the rate of change of that area (red line). This is an exercise where you can enjoy watching the movement of the mouth alongside its waveform. Playing with it may give you some hints.
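One way to compute the area of the diamond formed by the 4 mouth points is the shoelace formula; the frame-to-frame difference then gives the rate of change. This is a sketch of that computation, not the blocks used in the app, and the sample coordinates are made up.

```python
def quad_area(points):
    """Shoelace formula for the diamond formed by 4 mouth landmarks,
    ordered around the mouth (e.g. left, top, right, bottom)."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def area_series(frames):
    """Blue line: area per frame; red line: frame-to-frame rate of change."""
    areas = [quad_area(f) for f in frames]
    rates = [b - a for a, b in zip(areas, areas[1:])]
    return areas, rates

# Diamond with diagonals 40 px and 20 px: area = 40 * 20 / 2 = 400
frame = [(80, 100), (100, 90), (120, 100), (100, 110)]
print(quad_area(frame))  # → 400.0
```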

 To properly distinguish, for example, "apple" from "orange" using such mouth movements, acceleration waveforms of the mouth-area change may also be needed. Hundreds of such waveforms (or their frequency-analysis results) would have to be prepared for each utterance type, and thorough training (learning) on them would be necessary.
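The acceleration waveform mentioned here can be approximated by the discrete second difference of the area series, sketched below (the input numbers are invented for illustration).

```python
def second_difference(values):
    """Discrete acceleration of a waveform:
    the difference of successive first differences."""
    velocity = [b - a for a, b in zip(values, values[1:])]
    return [b - a for a, b in zip(velocity, velocity[1:])]

print(second_difference([400, 420, 470, 470]))  # → [30, -50]
```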

References
[1] Facemesh Filter Camera, MIT App Inventor
http://appinventor.mit.edu/explore/resources/ai/facemesh
[2] Artificial Intelligence with MIT App Inventor
http://appinventor.mit.edu/explore/ai-with-mit-app-inventor
