Tuesday, June 26, 2007

Awash in real interaction data

In the past I've lamented the lack of published VUI user data. If you go to the big design conferences (HFES, CHI, UPA, etc.) you'll find lots of data from studies on web navigation and mobile device use, but very little on speech interfaces. I know this because when I present a study at these conferences, the speech session (if there is one) is very small. Science and practice are built on good data, and there isn't much to go on right now.

There are, in fact, lots of data. Nearly all implementations go through at least one tuning cycle in which caller utterances are recorded, transcribed, and matched to the speech recognizer's responses. These data are then analyzed for things such as grammar coverage and the appropriateness of the prompts. Each tuning exercise generates a LOT of data. Unfortunately, the data are usually locked up by the company conducting the tuning exercise, never to see the light of day. There are legitimate reasons why companies don't share their data. Publishing the data is of little value to the company that produces it, and competitors could take advantage of it without responding in kind. Even if scrubbed, the data could be used to identify the company or business units for which they were produced. So I can see why companies are disinclined to distribute their data.

Still, I wonder what sort of intervention could be put in place to motivate companies to share their tuning data for the purpose of improving the best practices in the industry. If anyone has ideas, let me know.
