Tuesday, September 15, 2020

Voice Calculator: well-crafted AI tutorial

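Summary
  "Voice Calculator" has been added as the sixth AI tutorial for high school students in MIT App Inventor. Given spoken input such as "How much is 3.14 times 8?" or "Then, what is 3.5 times 12?", it answers with the calculation result by voice. Although rudimentary, it is a mini "Siri" or mini "OK Google!" of your own, so I expect it to catch students' interest. When you actually try it, you will probably think, "It's this easy!?" That is due, first, to the excellence of App Inventor itself and, second, to the carefully prepared guidance (much like the preparation you put into your own experiments). This app is only a template, and the tutorial encourages you to think up your own extensions and improvements. I think that is an important point. It may be a useful reference for high school courses or for introductory AI classes in the first years of university. Adults can enjoy it on their smartphones, too! I have nothing that needs adding to this excellent tutorial, but this article focuses on points to note when creating a Japanese version (character-by-character inspection, or the use of regular expressions), so I hope it is helpful.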

Creating your very first own voice calculator

    A new tutorial, Voice Calculator [1], aimed mainly at high school students, has been added to "Artificial Intelligence with MIT App Inventor" [2]. The app performs basic arithmetic operations given by voice. Although it is a rudimentary application, it is a challenging task because the commands are given in natural language. For multiplication, for example, the various utterances shown in Fig. 1 are accepted. Addressing these challenges helps us understand how Alexa and Siri, which are commonly used on smartphones, interpret human speech and respond appropriately.


Fig.1 Acceptable utterance examples for multiplication

    This project makes use of the speech recognition and text processing blocks built into App Inventor. The calculation result is then spoken back in a synthesized voice. The explanation of the creation procedure is meticulously put together and easy to follow. You are given appropriate hints along the way, so you can complete an app like the one in Fig. 2 without frustration.

Fig.2 Voice Calculator made with MIT App Inventor


    I have nothing to add to this excellent tutorial. However, there are some things to consider when creating this project in Japanese, which I describe below.

Extracting numerical values from the uttered text

(1) Splitting at spaces
    Fig. 3 shows an example in which an utterance about multiplication is converted into text by speech recognition. Note that in the English version the numbers are separated from the surrounding words by spaces, but in the Japanese version there are no such spaces. In the English version, you can extract the numerical values (two in this case) by relying on those spaces and using App Inventor's text processing blocks "split at spaces" and "is number?", as shown in Fig. 4. This method cannot be applied to the Japanese version because the text contains no such white space.

Fig.3 Resulting text from speech recognition 


Fig.4 Extracting numbers by splitting at spaces
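
    As a rough text counterpart to the blocks in Fig. 4, the following JavaScript sketch splits the recognized text at spaces and keeps the words that look like numbers. The function name and the sample sentences are assumptions for illustration; they are not taken from the tutorial, and the number test here only accepts plain decimals.

```javascript
// Hypothetical helper: imitates "split at spaces" followed by "is number?".
function extractNumbersBySpaces(text) {
  return text
    .trim()
    .split(/\s+/)                                   // split at runs of white space
    .filter((word) => /^\d+(\.\d+)?$/.test(word))   // keep plain decimal numbers only
    .map(Number);                                   // convert the strings to numbers
}

// Assumed English utterance: the numbers are separate words.
console.log(extractNumbersBySpaces("what is 3.14 times 8")); // [ 3.14, 8 ]

// Japanese utterance: there are no spaces, so the whole text is one "word".
console.log(extractNumbersBySpaces("3.14掛ける8はいくら"));   // []
```

    The second call returns an empty list, which is exactly why the Japanese version needs a different approach.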

(2) Character by character inspection in the Japanese version
    The Japanese version requires an alternative method, such as the one shown in Fig. 5. Here, you scan the text character by character and check whether each character is a digit or a period. This looks somewhat complicated, but it is good string-manipulation practice for beginners.

Fig.5 Extracting numbers by examining each character
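
    The same idea can be sketched in JavaScript. This is only an illustration of character-by-character scanning under the assumption that the numbers in the text are well formed (for example, no stray period standing alone); the function name is hypothetical and the code is not a transcription of the blocks in Fig. 5.

```javascript
// Hypothetical helper: scans the text one character at a time.
function extractNumbersByScanning(text) {
  const numbers = [];
  let current = "";                        // digits and periods collected so far
  for (const ch of text) {
    const isDigit = ch >= "0" && ch <= "9";
    if (isDigit || ch === ".") {
      current += ch;                       // still inside a number
    } else if (current !== "") {
      numbers.push(Number(current));       // a number has just ended
      current = "";
    }
  }
  if (current !== "") {
    numbers.push(Number(current));         // number at the very end of the text
  }
  return numbers;
}

console.log(extractNumbersByScanning("3.14掛ける8はいくら")); // [ 3.14, 8 ]
```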

(3) Using Regular Expressions
    Alternatively, you can use a regular expression search. For real numbers, use the pattern "[\d.]+" or "[0-9.]+", either of which matches a run of digits and periods. The simplest way is to use a regular expression extension such as [3]. Fig. 6 shows an example of its use.

Fig.6 Extracting numbers using a regular expression extension

    Another way to use regular expressions is to write a short JavaScript program, as illustrated in Fig. 7. A good tutorial on calling JavaScript programs from MIT App Inventor can be found in reference [4].

Fig.7 Extracting numbers with regular expressions in JavaScript
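
    For readers without the figure, here is a minimal sketch of such a program, using the pattern "[0-9.]+" mentioned above. The function name and the comma-separated reply format are assumptions made for illustration and may differ from the program in Fig. 7. The exchange with the app goes through the WebViewer's WebViewString, and the guard lets the sketch also run outside the WebViewer.

```javascript
// Hypothetical helper: extracts numbers with the pattern [0-9.]+.
function extractNumbersByRegex(text) {
  // match() returns null when nothing is found, so fall back to an empty list.
  const matches = text.match(/[0-9.]+/g) || [];
  return matches.map(Number);
}

// Inside an App Inventor WebViewer, text is exchanged through WebViewString.
// The reply format (numbers joined by commas) is an assumption of this sketch.
if (typeof window !== "undefined" && window.AppInventor) {
  const utterance = window.AppInventor.getWebViewString();
  window.AppInventor.setWebViewString(extractNumbersByRegex(utterance).join(","));
}

console.log(extractNumbersByRegex("では、12の3.5倍は")); // [ 12, 3.5 ]
```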

    Incidentally, the format of the file path given to the WebViewer URL changed recently. As shown in Fig. 7, specify "http://localhost/..." as the file path for both debug and production runs. See Evan's posts [5][6] and the related post [7] for more details.

Notice

    It is also worth noting that the recently added Text blocks can be used effectively. For example, as shown in Fig. 8, not only "contains" but also "contains any" and "contains all" are very useful for determining whether an utterance is intended as a multiplication.

Fig.8 An example of the newly developed Text Blocks
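
    The logic behind these blocks can be sketched in JavaScript as follows. The two helper functions are hypothetical counterparts of "contains any" and "contains all", and the keyword lists are assumptions for illustration; the lists used in the tutorial and in Fig. 8 may differ.

```javascript
// Hypothetical counterparts of the "contains any" and "contains all" Text blocks.
const containsAny = (text, pieces) => pieces.some((p) => text.includes(p));
const containsAll = (text, pieces) => pieces.every((p) => text.includes(p));

// Assumed keyword lists, not taken from the tutorial.
const MULTIPLY_WORDS = ["times", "multiply", "multiplied by", "掛ける", "倍"];
const PRODUCT_WORDS = ["product", "and"]; // e.g. "the product of 3 and 4"

function isMultiplication(utterance) {
  const text = utterance.toLowerCase();
  // Any single multiplication word is enough, or all words of the "product" phrasing.
  return containsAny(text, MULTIPLY_WORDS) || containsAll(text, PRODUCT_WORDS);
}

console.log(isMultiplication("3.14掛ける8はいくら"));             // true
console.log(isMultiplication("What is the product of 3 and 4")); // true
console.log(isMultiplication("what is 3 plus 4"));               // false
```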


References

[1] Voice Calculator
http://appinventor.mit.edu/explore/resources/ai/voice-calculator

[2] Artificial Intelligence with MIT App Inventor
http://appinventor.mit.edu/explore/ai-with-mit-app-inventor

[3] Regular expression extension (by kevinkun)
https://community.thunkable.com/t/regular-expression-extension/3657

[4] WebView Javascript Processor for App Inventor
https://appinventor.mit.edu/explore/ai2/webview-javascript

[5] App Inventor nb185 is live
https://community.appinventor.mit.edu/t/app-inventor-nb185-is-live/15563

[6] File Path Updates Starting with Android 10 (Evan's Blog, Aug 8, 2020)
http://appinventor.mit.edu/blogs/evan/2020/08/08/file-path-updates-android-10

[7] How to call a simple Javascript function
https://community.appinventor.mit.edu/t/how-to-call-a-simple-javascript-function/15698



