{"id":10275,"date":"2019-04-22T17:51:56","date_gmt":"2019-04-22T15:51:56","guid":{"rendered":"https:\/\/www.hackingchinese.com\/?p=10275"},"modified":"2021-09-21T20:44:45","modified_gmt":"2021-09-21T18:44:45","slug":"using-speech-recognition-to-improve-chinese-pronunciation-part-2","status":"publish","type":"post","link":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/","title":{"rendered":"How good is voice recognition for learning Chinese pronunciation?"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-10289\" src=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png\" alt=\"\" width=\"750\" height=\"606\" srcset=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png 750w, https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2-300x242.png 300w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/p>\n<p>In the previous article, we started exploring how speech recognition can be used to improve your Mandarin pronunciation. The main goal of that article was to investigate false negatives, or in other words, when the speech recognition says something is wrong, but it&#8217;s actually correct.<\/p>\n<h3>Can speech recognition be used to learn Mandarin pronunciation?<\/h3>\n<p>The conclusion was that speech recognition was very good at identifying two-syllable words and sentences, but not as good when it came to single-syllable words. The takeaway for learners was that if your pronunciation is very good, the speech recognition on your phone will likely be able to identify what you say.<\/p>\n<p>This article continues where the first one left off. If you haven&#8217;t read that article yet, I suggest that you do so before reading this one. I will try not to repeat too many things from last time and will assume that you have read the first article. 
You can find it here:<\/p>\n<p><strong><a class=\"row-title\" href=\"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-1\" aria-label=\"\u201cUsing speech recognition to improve Chinese pronunciation, part 1\u201d (Edit)\">Using speech recognition to improve Chinese pronunciation, part 1<\/a><\/strong><\/p>\n<p><em>Tune in to <a href=\"https:\/\/www.hackingchinese.com\/podcast\/\">the Hacking Chinese Podcast<\/a> to listen to the related episode:<\/em><br \/>\n<iframe loading=\"lazy\" src=\"https:\/\/anchor.fm\/hackingchinese\/embed\/episodes\/59---Can-you-use-your-phones-speech-recognition-to-practise-Chinese-pronunciation-e17m5fa\" width=\"400px\" height=\"102px\" frameborder=\"0\" scrolling=\"no\"><\/iframe><br \/>\n<em>Available on <a href=\"https:\/\/podcasts.apple.com\/us\/podcast\/hacking-chinese-podcast\/id1536284827\">Apple Podcasts<\/a>, <a href=\"https:\/\/www.google.com\/podcasts?feed=aHR0cHM6Ly9hbmNob3IuZm0vcy8zODhlYjllOC9wb2RjYXN0L3Jzcw==\">Google Podcast<\/a>, <a href=\"https:\/\/overcast.fm\/itunes1536284827\/hacking-chinese-podcast\">Overcast<\/a>, <a href=\"https:\/\/open.spotify.com\/show\/5iCRv1jg3j3yJZGJlYVYaO\">Spotify<\/a> and many other platforms!<\/em><\/p>\n<h3>How well does speech recognition handle non-native audio?<\/h3>\n<p>In this article, I will try to answer the following question:<\/p>\n<blockquote><p>If I say something and the voice recognition spits out exactly what I intended to say, does that really mean that my pronunciation is good, or could it be that the voice recognition is too lenient?<\/p><\/blockquote>\n<p>We will also look at the question the first article discussed, but now using non-native audio. 
That question was:<\/p>\n<blockquote><p>If I say something and the voice recognition spits out something else, does that really mean that my pronunciation is bad, or could it be that the voice recognition is wrong?<\/p><\/blockquote>\n<p>For more about the experiment setup and caveats regarding that, please see the first article. Just like last time, the results are split across monosyllabic words, disyllabic words and short phrases.<\/p>\n<h3>A) Monosyllabic words<\/h3>\n<p>The results of the first part, monosyllabic words, are presented in the table below. Each item is presented as follows:<\/p>\n<ol>\n<li><strong>The number<\/strong> of the item<\/li>\n<li><strong>The utterance in Pinyin<\/strong> with attached audio (click to play)<\/li>\n<li><strong>My judgement:<\/strong> If correct, the intended word; if incorrect, problems are pointed out (T means tone, I means initial, F means final, X means several issues at once; in the original pronunciation check, these were described and explained in detail, but here I have merely indicated what&#8217;s wrong)<\/li>\n<li><strong>My score:<\/strong>\u00a00 means &#8220;this is likely to be perceived as the wrong syllable&#8221; and 3 means &#8220;very likely to be perceived as the right syllable&#8221;.<\/li>\n<li><strong>Google&#8217;s guess:<\/strong> Please note that the software can&#8217;t know which specific character the speaker is reading, so any character with the same pronunciation is considered correct.<\/li>\n<li><strong>Google&#8217;s score: <\/strong>One point is earned for each correct identification, out of three possible.<\/li>\n<li><strong>Apple&#8217;s guess:<\/strong> Please note that the software can&#8217;t know which specific character the speaker is reading, so any character with the same pronunciation is considered correct.<\/li>\n<li><strong>Apple&#8217;s score:<\/strong> One point is earned for each correct identification, out of three possible.<\/li>\n<\/ol>\n<p>I have analysed pronunciation from two 
students who both participated in my pronunciation course. This is only part of the material covered and my comments have been limited to fit the format of this article. I chose one female and one male student. To get more reliable results, more students would need to be included, but two should be good enough to get the discussion started.<\/p>\n<h3>Student A<\/h3>\n<table border=\"1\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><strong>Number<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Student<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Olle<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Score<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Google<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Score<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Apple<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Score<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A1<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-bo1.mp3\">b\u014d<\/a><\/td>\n<td style=\"text-align: center;\">\u6ce2<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u64ad<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u62e8<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A2<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-fu4.mp3\">f\u00f9<\/a><\/td>\n<td style=\"text-align: center;\">\u7236<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u9644<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u4ed8<\/td>\n<td style=\"text-align: center;\" 
bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A3<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-ti2.mp3\">t\u00ed<\/a><\/td>\n<td style=\"text-align: center;\">T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u63d0<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u4f53<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A4<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-zou3.mp3\">z\u01d2u<\/a><\/td>\n<td style=\"text-align: center;\">\u8d70<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u8d70<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u8d70<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A5<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-er3.mp3\">\u011br<\/a><\/td>\n<td style=\"text-align: center;\">\u8033<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u8033<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u513f<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A6<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-yu3.mp3\">y\u01d4<\/a><\/td>\n<td style=\"text-align: center;\">\u96e8<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u4e0e<\/td>\n<td 
style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u4e0e<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A7<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-zhi4.mp3\">zh\u00ec<\/a><\/td>\n<td style=\"text-align: center;\">\u5fd7<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">g*<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u636e<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A8<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-sha1.mp3\">sh\u0101<\/a><\/td>\n<td style=\"text-align: center;\">\u6c99<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u867e<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u867e<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A9<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-ce4.mp3\">c\u00e8<\/a><\/td>\n<td style=\"text-align: center;\">\u7b56<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u8272<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u585e<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A10<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-pei2.mp3\">p\u00e9i<\/a><\/td>\n<td style=\"text-align: 
center;\">X<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u563f<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u563f<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\"><strong>Total<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\"><strong>77%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\"><strong>50%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\"><strong>43%<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em>*I\u00a0tried many times but never managed to get an actual Mandarin syllable here.<\/em><\/p>\n<h3>Student B<\/h3>\n<table border=\"1\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><strong>Number<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Student<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Olle<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Score<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Google<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Score<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Apple<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Score<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B1<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-bo1.mp3\">b\u014d<\/a><\/td>\n<td style=\"text-align: center;\">\u6ce2<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u591a<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u55ef<\/td>\n<td 
style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B2<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-fu4.mp3\">f\u00f9<\/a><\/td>\n<td style=\"text-align: center;\">F<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u6570<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u55ef<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B3<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-ti2.mp3\">t\u00ed<\/a><\/td>\n<td style=\"text-align: center;\">X<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u8bf7<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u671f<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B4<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-zou3.mp3\">z\u01d2u<\/a><\/td>\n<td style=\"text-align: center;\">I<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u624b<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u8d70<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B5<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-er3.mp3\">\u011br<\/a><\/td>\n<td style=\"text-align: center;\">F<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: 
center;\">\u5076\u5c14<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u5076\u5c14<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B6<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-yu3.mp3\">y\u01d4<\/a><\/td>\n<td style=\"text-align: center;\">\u96e8<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u9c7c<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u4e0e<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B7<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-zhi4.mp3\">zh\u00ec<\/a><\/td>\n<td style=\"text-align: center;\">F<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u8fd9<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u8fd9<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B8<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-sha1.mp3\">sh\u0101<\/a><\/td>\n<td style=\"text-align: center;\">\u6c99<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u4e0a<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u50bb<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B9<\/td>\n<td style=\"text-align: center;\"><a 
href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-ce4.mp3\">c\u00e8<\/a><\/td>\n<td style=\"text-align: center;\">I<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u8272<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u662f<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B10<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-pei2.mp3\">p\u00e9i<\/a><\/td>\n<td style=\"text-align: center;\">X<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u9ed1<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u55ef<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\"><strong>Total<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\"><strong>60%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\"><strong>7%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\"><strong>20%<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Discussion<\/h3>\n<p>Average deviations across both students and both software providers:<\/p>\n<ul>\n<li><strong>Items where I gave a higher score:<\/strong> 65% (13\/20)<\/li>\n<li><strong>Items where the speech recognition gave a higher score:<\/strong> 5% (1\/20)<\/li>\n<li><strong>Items where both agreed about the score: <\/strong>30% (6\/20)<\/li>\n<\/ul>\n<p>Looking at the results for these two students, we start seeing some interesting patterns. 
Naturally, a lot more data would need to be collected and analysed to draw any widely applicable conclusions, but it seems the speech recognition often fails for these single-syllable words unless the input audio is really good, both in terms of pronunciation and quality.<\/p>\n<p>I consistently rate these students much higher than the score from the speech recognition software indicates. In the case of student B, the difference between my appraisal and Google&#8217;s success rate is huge (60% correct vs. 7% correct). I suspect that audio quality plays a role here, though, as the recording quality from student B is not as good as for student A or the teacher audio used in the previous article. Background noise is pretty easy for a human to disregard, but I assume it&#8217;s much harder for a computer to do that!<\/p>\n<h3>What this means for you as a learner<\/h3>\n<p>The results here basically tell us that unless your pronunciation is already very clean, you can&#8217;t expect speech recognition to do a good job. It is very likely to judge you more harshly than you deserve. The conclusion for single-syllable words is that speech recognition software can tell you if you&#8217;re near-native, but unless you are, it&#8217;s not very useful and will misunderstand you more than a human would.<\/p>\n<h3>B) Disyllabic words<\/h3>\n<p>Two-syllable words are the backbone of Mandarin and <a href=\"https:\/\/www.hackingchinese.com\/focusing-on-tone-pairs-to-improve-your-mandarin-pronunciation\/\">something I often advise students to focus on when it comes to tones<\/a>.<\/p>\n<p>The columns are the same as above, but scoring works slightly differently. For each error that could cause a syllable to be perceived as another syllable, one point is deducted from a total of three.<\/p>\n<p>For example, getting both the tones wrong, but getting everything else right, would deduct two points (e.g. 
item A11), whereas getting only one final wrong on one syllable would deduct only one point (e.g. item A15). Again, T means tone, I initial, F final and X means several errors at once. The number refers to which syllable the error is on.<\/p>\n<p>For the speech recognition columns, a majority vote is used (out of three attempts) and the same scoring as described above is then applied. For example, item A13 was identified two out of three times as \u6c34\u9e21 on iOS, which is wrong both because the tone on the first syllable should be rising rather than low, and because the initial should be\u00a0<em>s<\/em> and not <em>sh<\/em> (two points deducted).<\/p>\n<h3>Student A<\/h3>\n<table border=\"1\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><strong>Number<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Student<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Olle<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Correct<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Google<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Correct<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Apple<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Correct<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A11<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-nv3ren2.mp3\">n\u01dar\u00e9n<\/a><\/td>\n<td style=\"text-align: center;\">1:T, 2:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u4f60\u4eba<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u5973\u4eba<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A12<\/td>\n<td style=\"text-align: center;\"><a 
href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-er3duo.mp3\">\u011brduo<\/a><\/td>\n<td style=\"text-align: center;\">1:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u8033\u6735<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u8033\u6735<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A13<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-sui2ji1.mp3\">su\u00edj\u012b<\/a><\/td>\n<td style=\"text-align: center;\">1:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u6c34\u673a<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u6c34\u9e21<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A14<\/td>\n<td style=\"text-align: center;\">\u00a0<a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-xiang4xia4.mp3\">xi\u00e0ngxi\u00e0<\/a><\/td>\n<td style=\"text-align: center;\">2:F<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u5411\u4e0a<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u5411\u4e0b<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A15<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-pin2qiong2.mp3\">p\u00ednqi\u00f3ng<\/a><\/td>\n<td style=\"text-align: center;\">2:F<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u8d2b\u7a77<\/td>\n<td style=\"text-align: center;\" 
bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u8d2b\u7a77<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A16<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-lao3shi1.mp3\">l\u01ceosh\u012b<\/a><\/td>\n<td style=\"text-align: center;\">\u8001\u5e08<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u8001\u5e08<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u8001\u5e08<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A17<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-qing1chu.mp3\">q\u012bngchu<\/a><\/td>\n<td style=\"text-align: center;\">1:I, 2:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u8bf7\u51fa<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u79e6\u743c<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A18<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-liao3jie3.mp3\">li\u01ceoji\u011b<\/a><\/td>\n<td style=\"text-align: center;\">1:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u4e86\u89e3<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u4e86\u89e3<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A19<\/td>\n<td style=\"text-align: center;\"><a 
href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-run4ze2.mp3\">r\u00f9nz\u00e9<\/a><\/td>\n<td style=\"text-align: center;\">2:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u65e5\u8bed\u8bcd<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u5f15\u5b50<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">A20<\/td>\n<td style=\"text-align: center;\">\u00a0<a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-bo2zi.mp3\">b\u00f3zi<\/a><\/td>\n<td style=\"text-align: center;\">1:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u6ce2\u5b50<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u58a8\u5b50<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\"><strong>Total<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\"><strong>63%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\"><strong>60%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\"><strong>67%<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Student B<\/h3>\n<table border=\"1\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><strong>Number<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Student<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Olle<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Correct<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Google<\/strong><\/td>\n<td style=\"text-align: 
center;\"><strong>Correct<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Apple<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Correct<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B11<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-nv3ren2.mp3\">n\u01dar\u00e9n<\/a><\/td>\n<td style=\"text-align: center;\">1:F, 2:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u5973\u4eba<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u725b\u4eba<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B12<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-er3duo.mp3\">\u011brduo<\/a><\/td>\n<td style=\"text-align: center;\">1:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u513f\u5b50<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u513f\u6b4c<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B13<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-sui2ji1.mp3\">su\u00edj\u012b<\/a><\/td>\n<td style=\"text-align: center;\">1:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u6c34\u673a<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u6c34\u6676<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B14<\/td>\n<td style=\"text-align: center;\">\u00a0<a 
href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-xiang4xia4.mp3\">xi\u00e0ngxi\u00e0<\/a><\/td>\n<td style=\"text-align: center;\">\u5411\u4e0b<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u5411\u4e0b<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u5411\u5411<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B15<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-pin2qiong2.mp3\">p\u00ednqi\u00f3ng<\/a><\/td>\n<td style=\"text-align: center;\">\u8d2b\u7a77<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<td style=\"text-align: center;\">\u51ed\u7965<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u8d2b\u7a77<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B16<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-lao3shi1.mp3\">l\u01ceosh\u012b<\/a><\/td>\n<td style=\"text-align: center;\">1:T, 2:F<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u72fc\u86c7<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u72fc\u795e<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B17<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-qing1chu.mp3\">q\u012bngchu<\/a><\/td>\n<td style=\"text-align: center;\">1:F, 2:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u9752\u4e18<\/td>\n<td 
style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u6e05\u695a<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B18<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-liao3jie3.mp3\">li\u01ceoji\u011b<\/a><\/td>\n<td style=\"text-align: center;\">2:I<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u8868\u59d0<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<td style=\"text-align: center;\">\u4e86\u89e3<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B19<\/td>\n<td style=\"text-align: center;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-run4ze2.mp3\">r\u00f9nz\u00e9<\/a><\/td>\n<td style=\"text-align: center;\">X<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u672c\u5b50<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<td style=\"text-align: center;\">\u5375\u5b50<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">B20<\/td>\n<td style=\"text-align: center;\">\u00a0<a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-bo2zi.mp3\">b\u00f3zi<\/a><\/td>\n<td style=\"text-align: center;\">1:T, 2:T<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u513f\u5b50<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<td style=\"text-align: center;\">\u767e\u8272<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\"><strong>Total<\/strong><\/td>\n<td style=\"text-align: 
center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\"><strong>53%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\"><strong>30%<\/strong><\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\"><strong>40%<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Average deviations across both students and both software providers:<\/p>\n<ul>\n<li><strong>Items where I gave a higher score:<\/strong> 50% (10\/20)<\/li>\n<li><strong>Items where the speech recognition gave a higher score:<\/strong> 35% (7\/20)<\/li>\n<li><strong>Items where both agreed about the score: <\/strong>15% (3\/20)<\/li>\n<\/ul>\n<p>Here we can see things evening out a bit, with my assessment being about the same as speech recognition for student A and just a bit higher for student B (not the order-of-magnitude difference we saw for single syllables).<\/p>\n<p>One thing that probably influences the results quite a bit here, but which is very hard to control for, is whether there is a real word that lies close to what the student says. When I listen to a student, I don&#8217;t have to guess at a word; I just write down what I hear. Speech recognition on smartphones, on the other hand, will always guess at something.<\/p>\n<p>For example, in the case of A13 and B13, I hear nothing that indicates that they are saying\u00a0<em>sh\u00a0<\/em>rather than\u00a0<em>s<\/em>, yet both iOS and Android hear\u00a0<em>sh.\u00a0<\/em>Or do they? 
Probably not; it&#8217;s just that there is no two-syllable word with that tone contour (the only common character with a low tone on the syllable\u00a0<em>sui\u00a0<\/em>is \u9ad3, but that doesn&#8217;t make sense with a\u00a0<em>j\u012b<\/em> coming after it).<\/p>\n<h3><strong>What this means for you as a learner<\/strong><\/h3>\n<p>Using two-syllable words works a lot better for pronunciation practice than using single-syllable words (that&#8217;s true when you practise with humans as well). It doesn&#8217;t seem like the speech recognition is too lenient here, but rather the opposite: small imperfections in pronunciation can throw it off completely, so that it guesses a different word which is far from what you said. So, you can probably use two-syllable words for practice, but treat the result only as a binary indicator (right or wrong, not what you said incorrectly or how serious it is).<\/p>\n<h3>C) Phrases<\/h3>\n<p>Now we&#8217;re approaching the home territory where voice recognition software ought to be good at guessing. It&#8217;s not normal for people to ambush their phones by suddenly saying strange things like &#8220;glossy&#8221;, but it is normal to dictate a sentence or ask a question. Let&#8217;s see if it works as well as it ought to!<\/p>\n<p>Scoring here works the same way as for the disyllabic words above, i.e. each error that could shift a syllable to a different meaning deducts one point. 
The maximum for each item is still 3.<\/p>\n<h3>Student A<\/h3>\n<table border=\"1\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><strong>A21<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-ta1shi4bushiwang2lao3shi1.mp3\">T\u0101 sh\u00ecbush\u00ec W\u00e1ng l\u01ceosh\u012b.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Slight rise on second \u662f.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u4ed6\u662f\u4e0d\u662f\u738b\u8001\u5e08<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u4ed6\u662f\u4e0d\u662f\u738b\u8001\u5e08<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>A22<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-ma2fanni3ba3yan2di4gei3wo3.mp3\">M\u00e1fan n\u01d0 b\u01ce y\u00e1n d\u00ec g\u011bi w\u01d2.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tones on \u4f60, \u76d0, \u9012.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u9ebb\u70e6\u60a8\u628a\u9a8c\u7684\u7ed9\u6211<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u9ebb\u70e6\u60a8\u628a\u786c\u5e01\u7ed9\u6211<\/td>\n<td style=\"text-align: center;\" 
bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>A23<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-wo3yi2ding4yao4qu4mei3guo2.mp3\">W\u01d2 y\u00edd\u00ecng y\u00e0o q\u00f9 M\u011bigu\u00f3.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tones on \u4e00, \u7f8e; initial+final on \u53bb.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u6211\u4e00\u5b9a\u8981\u53bb\u7f8e\u56fd<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u6211\u4e00\u5b9a\u8981\u53bb\u7f8e\u56fd<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>A24<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-qing3wen4wo3ke3yi3jin4lai2ma.mp3\">Q\u01d0ngw\u00e8n, w\u01d2 k\u011by\u01d0 j\u00ecnlai ma?<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tones on \u8fdb, \u6765.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u8bf7\u95ee\u6211\u53ef\u4ee5\u5c3d\u5356\u5417<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u8bf7\u95ee\u6211\u53ef\u4ee5\u8fdb\u6765\u5417<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td 
style=\"text-align: center;\"><strong>A25<\/strong><\/td>\n<td style=\"text-align: left;\">\u00a0<a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/a-fang2zu1yi2gong4shi4yi4qian1wu3bai3yi1shi2yuan2.mp3\">F\u00e1ngz\u016b y\u00edg\u00f2ng sh\u00ec y\u00ecqi\u0101n w\u01d4b\u01cei y\u012bsh\u00ed yu\u00e1n.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tone on \u4e00\u5341.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u623f\u79df\u4e00\u5171\u662f1510\u5143<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u00a0\u623f\u79df\u4e00\u5171\u662f1510\u5143<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ul>\n<li><strong>Olle&#8217;s score:<\/strong> 33%<\/li>\n<li><strong>Google&#8217;s score:<\/strong> 66%<\/li>\n<li><strong>Apple&#8217;s score:\u00a0<\/strong>80%<\/li>\n<\/ul>\n<h3>Student B<\/h3>\n<table border=\"1\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><strong>B21<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-ta1shi4bushiwang2lao3shi1.mp3\">T\u0101 sh\u00ecbush\u00ec W\u00e1ng l\u01ceosh\u012b.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tone on \u738b.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u4ed6\u662f\u4e0d\u662f\u738b\u8001\u5e08<\/td>\n<td style=\"text-align: center;\" 
bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u55ef\u662f\u4e0d\u662f\u738b\u5b9d\u65af<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>B22<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-ma2fanni3ba3yan2di4gei3wo3.mp3\">M\u00e1fan n\u01d0 b\u01ce y\u00e1n d\u00ec g\u011bi w\u01d2.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tones on \u76d0, wrong final on \u6211.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u9ebb\u70e6\u4f60\u628a\u76d0\u9012\u7ed9\u6211<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u9ebb\u70e6\u4f60\u628a\u989c\u5b81\u51e0\u70b9\u6211<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>B23<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-wo3yi2ding4yao4qu4mei3guo2.mp3\">W\u01d2 y\u00edd\u00ecng y\u00e0o q\u00f9 M\u011bigu\u00f3.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Unclear tones on \u8981 and \u53bb, wrong on \u7f8e; and final on \u53bb.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u6211\u4e00\u5b9a\u8981\u53bb\u7f8e\u56fd<\/td>\n<td style=\"text-align: center;\" 
bgcolor=\"#58D68D\">3<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u6211\u4e00\u5b9a\u8981\u5a36\u4f60\u56de\u56fd<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>B24<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-qing3wen4wo3ke3yi3jin4lai2ma.mp3\">Q\u01d0ngw\u00e8n, w\u01d2 k\u011by\u01d0 j\u00ecnlai ma?<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tone on \u8fdb.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#D4EFDF\">2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u8bf7\u95ee\u554a\u53ef\u4ee5\u5faa\u73af\u5417<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u8bf7\u95ee\u6211\u53ef\u4ee5\u53bb\u73a9\u561b<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>B25<\/strong><\/td>\n<td style=\"text-align: left;\"><a href=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/b-fang2zu1yi2gong4shi4yi4qian1wu3bai3yi1shi2yuan2.mp3\">F\u00e1ngz\u016b y\u00edg\u00f2ng sh\u00ec y\u00ecqi\u0101n w\u01d4b\u01cei y\u012bsh\u00ed yu\u00e1n.<\/a><\/td>\n<td style=\"text-align: center;\"><b>Score<\/b><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Olle<\/td>\n<td style=\"text-align: left;\">Wrong tones on \u5171 and \u5343.<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#FADBD8\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Google<\/td>\n<td style=\"text-align: left;\">\u623f\u79df\u4e00\u5171\u662f1550\u5143<\/td>\n<td style=\"text-align: center;\" 
bgcolor=\"#D4EFDF\">2<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Apple<\/td>\n<td style=\"text-align: left;\">\u623f\u79df\u4e00\u4e2d\u5b661510\u5143<\/td>\n<td style=\"text-align: center;\" bgcolor=\"#E74C3C\">0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ul>\n<li><strong>Olle&#8217;s score:<\/strong> 40%<\/li>\n<li><strong>Google&#8217;s score:<\/strong> 85%<\/li>\n<li><strong>Apple&#8217;s score:\u00a0<\/strong>0%<\/li>\n<\/ul>\n<h3>Discussion<\/h3>\n<p>Here we see the first major deviation between the two speech recognition providers. For student B, Google gave a score of 85%, compared with my rating of 40%, more than twice as high. But for the same student, Apple gave a score of 0%! My guess is that this is mostly because of the low audio quality, and that the data from student B should probably be disregarded for this reason.<\/p>\n<p>What we can learn from this as students is that, at least if you&#8217;re on an Apple phone, you need a good recording environment! That advice is of course applicable to any student, but here it probably made up most of the difference between a 0% score and an 85% score.<\/p>\n<p>For student A, with higher audio quality, the result is more as expected: speech recognition is considerably more lenient than I am. This is again because of context and the fact that there are only so many sentences that make sense (even though some of the suggested sentences don&#8217;t make much sense either).<\/p>\n<p>If anyone turns this into a serious research project, audio quality should of course be kept constant to confirm that my hypothesis here is correct.<\/p>\n<h3>What this means for you as a learner<\/h3>\n<p>If we disregard the low audio quality, speech recognition can lead you to think that your pronunciation is better than it actually is. The scores from both Google and Apple are about 50% to 100% higher than my manual assessment. 
This ties in well with what I said in the introduction of the first article: speech recognition is not meant to give you a fair assessment of your pronunciation; it&#8217;s designed to understand what you want to say. The more clues it gets, the better it will guess.<\/p>\n<h3>General conclusion<\/h3>\n<p>In these two articles, I&#8217;ve tried to answer the question of whether speech recognition can be used to check your Mandarin pronunciation. As I have reminded readers throughout the two articles, the results here can only be tentative, so take these notes with a pinch of salt.<\/p>\n<ol>\n<li><strong>Speech recognition is next to useless for single-syllable words<\/strong>, unless you just want to verify that your pronunciation is native-like (and it still might not work; see the results for dipping third tones). Getting single syllables right means that your pronunciation is very good, but don&#8217;t be discouraged if your phone doesn&#8217;t understand you; it might not be your fault!<\/li>\n<li><strong>Speech recognition works better for two-syllable words<\/strong>, which is nice since you should probably practise those anyway. Here, it&#8217;s certainly possible to get it 100% right if your pronunciation is good enough, but don&#8217;t be too discouraged if your phone thinks you&#8217;re saying something completely different, because even a small mistake can throw it off. However, if you say it right, it will probably recognise what you say.<\/li>\n<li><strong>Speech recognition is probably too lenient for sentences.\u00a0<\/strong>Provided that the sentence is fairly common, you need to make several errors at once to derail the speech recognition algorithm. You cannot assume that your pronunciation is good just because your phone writes out the right sentence, but you can be fairly sure your pronunciation is not good if it doesn&#8217;t understand you at all.<\/li>\n<\/ol>\n<p>That&#8217;s it for now! 
I would be happy to hear about your experience with speech recognition. Perhaps you can try the words and phrases used in this article and see if your phone can transcribe them correctly? Please share the results in the comments!<\/p>\n<p>It will be interesting to follow how this develops in the future. As speech recognition becomes even better at handling non-native accents, we should see its usefulness for checking pronunciation decrease as the software will become better at understanding incorrectly pronounced sentences. This is probably true for words as well, but since there&#8217;s so little context to use there, I doubt the situation will change quickly in that area.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Speech recognition technology has developed rapidly and can now be relied on to correctly identify standardised and clear pronunciation in Mandarin. But can it be used to check your Mandarin pronunciation? Not necessarily. This article looks at how well speech recognition software deals with non-native and low-quality audio, focusing on the question if speech recognition is too lenient for pronunciation practice.<\/p>\n","protected":false},"author":1,"featured_media":10289,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,5,6,10,14,20,21],"tags":[45,973,238,977,456,974,976,975,779,611,972],"class_list":["post-10275","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-advanced","category-beginner","category-distinctively-chinese","category-intermediate","category-learning-outside-class","category-science-and-research","category-speaking","tag-android","tag-apple","tag-google","tag-ios","tag-pronunciation","tag-siri","tag-speech-recognition","tag-speech-to-text","tag-technology","tag-tones","tag-voice-recognition"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - 
https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>How good is voice recognition for learning Chinese pronunciation?<\/title>\n<meta name=\"description\" content=\"Speech recognition technology has developed rapidly and can now be relied on to correctly identify standardised and clear pronunciation in Mandarin. But can it be used to check your Mandarin pronunciation? Not necessarily. This article looks at how well speech recognition software deals with non-native and low-quality audio, focusing on the question if speech recognition is too lenient for pronunciation practice.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How good is voice recognition for learning Chinese pronunciation?\" \/>\n<meta property=\"og:description\" content=\"Speech recognition technology has developed rapidly and can now be relied on to correctly identify standardised and clear pronunciation in Mandarin. But can it be used to check your Mandarin pronunciation? Not necessarily. 
This article looks at how well speech recognition software deals with non-native and low-quality audio, focusing on the question if speech recognition is too lenient for pronunciation practice.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/\" \/>\n<meta property=\"og:site_name\" content=\"Hacking Chinese\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/HackingChinese\" \/>\n<meta property=\"article:published_time\" content=\"2019-04-22T15:51:56+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-09-21T18:44:45+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png\" \/>\n\t<meta property=\"og:image:width\" content=\"750\" \/>\n\t<meta property=\"og:image:height\" content=\"606\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Olle Linge\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@HackingChinese\" \/>\n<meta name=\"twitter:site\" content=\"@HackingChinese\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Olle Linge\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/\"},\"author\":{\"name\":\"Olle 
Linge\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#\\\/schema\\\/person\\\/fd696a7384c7de665cc9d67c15205b15\"},\"headline\":\"How good is voice recognition for learning Chinese pronunciation?\",\"datePublished\":\"2019-04-22T15:51:56+00:00\",\"dateModified\":\"2021-09-21T18:44:45+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/\"},\"wordCount\":2541,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.hackingchinese.com\\\/wp-content\\\/uploads\\\/2019\\\/03\\\/siri-edit-2.png\",\"keywords\":[\"Android\",\"Apple\",\"Google\",\"iOS\",\"Pronunciation\",\"Siri\",\"Speech recognition\",\"Speech to text\",\"Technology\",\"Tones\",\"Voice recognition\"],\"articleSection\":[\"Advanced\",\"Beginner\",\"Distinctively Chinese\",\"Intermediate\",\"Learning outside class\",\"Science and research\",\"Speaking\"],\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/\",\"url\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/\",\"name\":\"How good is voice recognition for learning Chinese 
pronunciation?\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.hackingchinese.com\\\/wp-content\\\/uploads\\\/2019\\\/03\\\/siri-edit-2.png\",\"datePublished\":\"2019-04-22T15:51:56+00:00\",\"dateModified\":\"2021-09-21T18:44:45+00:00\",\"description\":\"Speech recognition technology has developed rapidly and can now be relied on to correctly identify standardised and clear pronunciation in Mandarin. But can it be used to check your Mandarin pronunciation? Not necessarily. This article looks at how well speech recognition software deals with non-native and low-quality audio, focusing on the question if speech recognition is too lenient for pronunciation 
practice.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.hackingchinese.com\\\/wp-content\\\/uploads\\\/2019\\\/03\\\/siri-edit-2.png\",\"contentUrl\":\"https:\\\/\\\/www.hackingchinese.com\\\/wp-content\\\/uploads\\\/2019\\\/03\\\/siri-edit-2.png\",\"width\":750,\"height\":606},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.hackingchinese.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How good is voice recognition for learning Chinese pronunciation?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#website\",\"url\":\"https:\\\/\\\/www.hackingchinese.com\\\/\",\"name\":\"Hacking Chinese\",\"description\":\"A better way of learning Mandarin\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.hackingchinese.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#organization\",\"name\":\"Hacking 
Chinese\",\"url\":\"https:\\\/\\\/www.hackingchinese.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.hackingchinese.com\\\/wp-content\\\/uploads\\\/2010\\\/09\\\/square-stamp-1000.png\",\"contentUrl\":\"https:\\\/\\\/www.hackingchinese.com\\\/wp-content\\\/uploads\\\/2010\\\/09\\\/square-stamp-1000.png\",\"width\":1000,\"height\":1000,\"caption\":\"Hacking Chinese\"},\"image\":{\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/HackingChinese\",\"https:\\\/\\\/x.com\\\/HackingChinese\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.hackingchinese.com\\\/#\\\/schema\\\/person\\\/fd696a7384c7de665cc9d67c15205b15\",\"name\":\"Olle Linge\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7d60e40795941ec743c532d9ba9a94d261cd89f55ab4a7a0a8271040e7046559?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7d60e40795941ec743c532d9ba9a94d261cd89f55ab4a7a0a8271040e7046559?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/7d60e40795941ec743c532d9ba9a94d261cd89f55ab4a7a0a8271040e7046559?s=96&d=mm&r=g\",\"caption\":\"Olle Linge\"},\"description\":\"Hi! My name is Olle Linge (\u51cc\u96f2\u9f8d) and I'm the creator and editor of Hacking Chinese. Read more about the website and me on the About page.\",\"sameAs\":[\"http:\\\/\\\/www.hackingchinese.com\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"How good is voice recognition for learning Chinese pronunciation?","description":"Speech recognition technology has developed rapidly and can now be relied on to correctly identify standardised and clear pronunciation in Mandarin. But can it be used to check your Mandarin pronunciation? Not necessarily. 
This article looks at how well speech recognition software deals with non-native and low-quality audio, focusing on the question if speech recognition is too lenient for pronunciation practice.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/","og_locale":"en_GB","og_type":"article","og_title":"How good is voice recognition for learning Chinese pronunciation?","og_description":"Speech recognition technology has developed rapidly and can now be relied on to correctly identify standardised and clear pronunciation in Mandarin. But can it be used to check your Mandarin pronunciation? Not necessarily. This article looks at how well speech recognition software deals with non-native and low-quality audio, focusing on the question if speech recognition is too lenient for pronunciation practice.","og_url":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/","og_site_name":"Hacking Chinese","article_publisher":"https:\/\/www.facebook.com\/HackingChinese","article_published_time":"2019-04-22T15:51:56+00:00","article_modified_time":"2021-09-21T18:44:45+00:00","og_image":[{"width":750,"height":606,"url":"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png","type":"image\/png"}],"author":"Olle Linge","twitter_card":"summary_large_image","twitter_creator":"@HackingChinese","twitter_site":"@HackingChinese","twitter_misc":{"Written by":"Olle Linge","Estimated reading time":"12 
minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#article","isPartOf":{"@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/"},"author":{"name":"Olle Linge","@id":"https:\/\/www.hackingchinese.com\/#\/schema\/person\/fd696a7384c7de665cc9d67c15205b15"},"headline":"How good is voice recognition for learning Chinese pronunciation?","datePublished":"2019-04-22T15:51:56+00:00","dateModified":"2021-09-21T18:44:45+00:00","mainEntityOfPage":{"@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/"},"wordCount":2541,"commentCount":1,"publisher":{"@id":"https:\/\/www.hackingchinese.com\/#organization"},"image":{"@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#primaryimage"},"thumbnailUrl":"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png","keywords":["Android","Apple","Google","iOS","Pronunciation","Siri","Speech recognition","Speech to text","Technology","Tones","Voice recognition"],"articleSection":["Advanced","Beginner","Distinctively Chinese","Intermediate","Learning outside class","Science and research","Speaking"],"inLanguage":"en-GB","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/","url":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/","name":"How good is voice recognition for learning Chinese 
pronunciation?","isPartOf":{"@id":"https:\/\/www.hackingchinese.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#primaryimage"},"image":{"@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#primaryimage"},"thumbnailUrl":"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png","datePublished":"2019-04-22T15:51:56+00:00","dateModified":"2021-09-21T18:44:45+00:00","description":"Speech recognition technology has developed rapidly and can now be relied on to correctly identify standardised and clear pronunciation in Mandarin. But can it be used to check your Mandarin pronunciation? Not necessarily. This article looks at how well speech recognition software deals with non-native and low-quality audio, focusing on the question if speech recognition is too lenient for pronunciation practice.","breadcrumb":{"@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#primaryimage","url":"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png","contentUrl":"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2019\/03\/siri-edit-2.png","width":750,"height":606},{"@type":"BreadcrumbList","@id":"https:\/\/www.hackingchinese.com\/using-speech-recognition-to-improve-chinese-pronunciation-part-2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.hackingchinese.com\/"},{"@type":"ListItem","position":2,"name":"How good is voice recognition for 
learning Chinese pronunciation?"}]},{"@type":"WebSite","@id":"https:\/\/www.hackingchinese.com\/#website","url":"https:\/\/www.hackingchinese.com\/","name":"Hacking Chinese","description":"A better way of learning Mandarin","publisher":{"@id":"https:\/\/www.hackingchinese.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.hackingchinese.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":"Organization","@id":"https:\/\/www.hackingchinese.com\/#organization","name":"Hacking Chinese","url":"https:\/\/www.hackingchinese.com\/","logo":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/www.hackingchinese.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2010\/09\/square-stamp-1000.png","contentUrl":"https:\/\/www.hackingchinese.com\/wp-content\/uploads\/2010\/09\/square-stamp-1000.png","width":1000,"height":1000,"caption":"Hacking Chinese"},"image":{"@id":"https:\/\/www.hackingchinese.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/HackingChinese","https:\/\/x.com\/HackingChinese"]},{"@type":"Person","@id":"https:\/\/www.hackingchinese.com\/#\/schema\/person\/fd696a7384c7de665cc9d67c15205b15","name":"Olle Linge","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/secure.gravatar.com\/avatar\/7d60e40795941ec743c532d9ba9a94d261cd89f55ab4a7a0a8271040e7046559?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/7d60e40795941ec743c532d9ba9a94d261cd89f55ab4a7a0a8271040e7046559?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/7d60e40795941ec743c532d9ba9a94d261cd89f55ab4a7a0a8271040e7046559?s=96&d=mm&r=g","caption":"Olle Linge"},"description":"Hi! My name is Olle Linge (\u51cc\u96f2\u9f8d) and I'm the creator and editor of Hacking Chinese. 
Read more about the website and me on the About page.","sameAs":["http:\/\/www.hackingchinese.com"]}]}},"_links":{"self":[{"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/posts\/10275","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/comments?post=10275"}],"version-history":[{"count":16,"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/posts\/10275\/revisions"}],"predecessor-version":[{"id":15075,"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/posts\/10275\/revisions\/15075"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/media\/10289"}],"wp:attachment":[{"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/media?parent=10275"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/categories?post=10275"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hackingchinese.com\/wp-json\/wp\/v2\/tags?post=10275"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}