Tuesday, June 26, 2007
There are, in fact, lots of data. Nearly all implementations go through at least one tuning cycle in which caller utterances are recorded, transcribed, and matched to the speech recognizer's responses. These data are then analyzed for things such as grammar coverage and the appropriateness of the prompts. Each tuning exercise generates a LOT of data. Unfortunately, the data are usually locked up by the company conducting the tuning exercise, never to see the light of day. There are legitimate reasons why companies don't share their data. Publication of the data is of little value to the company that produces it, and competitors could take advantage of it without responding in kind. Even if scrubbed, the data could be used to identify the company or business units for which they were produced. So I can see why companies are disinclined to distribute their data.
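To make the grammar-coverage part of the analysis concrete, here's a minimal sketch of the idea - checking transcribed utterances against the set of phrases a grammar accepts to estimate the in-grammar rate. Everything here is hypothetical toy data; a real tuning exercise would use a full recognition grammar (e.g. SRGS) and thousands of transcripts:

```python
# Hypothetical sketch: estimate grammar coverage from transcribed caller
# utterances. Both the grammar and the transcripts below are invented toy
# data, standing in for a real tuning exercise's artifacts.

# Phrases the recognizer's grammar accepts (toy stand-in for a real grammar).
grammar = {
    "check my balance",
    "transfer funds",
    "speak to an agent",
}

# Transcribed caller utterances from a (hypothetical) tuning session.
transcripts = [
    "check my balance",
    "what's my balance",
    "transfer funds",
    "operator",
]

def coverage(grammar, transcripts):
    """Return the fraction of utterances that are in-grammar, plus the misses."""
    covered = [u for u in transcripts if u in grammar]
    missed = [u for u in transcripts if u not in grammar]
    return len(covered) / len(transcripts), missed

rate, missed = coverage(grammar, transcripts)
print(f"In-grammar rate: {rate:.0%}")        # 50%
print("Out-of-grammar utterances:", missed)  # these drive grammar revisions
```

The out-of-grammar list is where the value lies: those are the utterances that tell you what callers actually say, and they're exactly the data that stay locked up.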
Still, I wonder what sort of intervention could be put in place to motivate companies to share their tuning data for the purpose of improving the best practices in the industry. If anyone has ideas, let me know.
Thursday, June 21, 2007
Rule 1: Don't over-engineer your system
Replicants Roy and Leon were built as fighters, presumably to protect property on off-world colonies. They were also created smarter and stronger than humans, and could pass for humans on Earth. Is it a good idea to build autonomous agents that are smarter and stronger than you and give them the capacity and motive to kill? No. It's a bad idea. Build your VUI system to do what its users need it to do and nothing more. And don't try to build something that could pass for a human.
Corollary to Rule 1: Keep your eye on the ROI
The technically sophisticated replicants were designed to self-destruct after four years, out of fear that they could develop uncontrollable feelings and emotions if allowed to live any longer. Replacing advanced technology is extremely expensive - better to build your system to last, and maintain it as needed.
Rule 2: Words are powerful shapers of behavior - be brief and to the point
Deckard and Sebastian were manipulated into doing what others wanted them to do: Deckard into chasing replicants and Sebastian into setting up a meeting between Roy and Tyrell. The dialogs required to do this were terse but gave the targets sufficient direction that they understood what needed to be done. Good VUI dialogs are short and give users direction on what they need to say, without resorting to the painful "In order to do x, say x" construction.
Rule 3: Our methods for evaluating advanced technology aren't good enough - we need better tools
The Voight-Kampff test for determining whether a subject was a replicant or a human was nearly obsolete. It took Deckard, a trained evaluator, 100 questions to determine Rachael's identity. Our evaluation methods for VUIs are in a similar state. We still mostly rely on usability tests and questionnaires that were developed to evaluate web pages and GUIs. I gave a presentation at a UPA workshop in 2002 and pointed out that our old usability test protocols aren't sufficient for evaluating things like autonomous agents and speech systems. We need better evaluation methods.
Tuesday, June 19, 2007
The little twist introduced in the movie is not about the replicants' supposed intelligence. The replicants clearly are intelligent. The movie even tweaks an old AI argument in one scene when one of the replicants beats its creator at chess. At one time it was said that machines could never play a good game of chess because chess requires insight and intuition - very high-level forms of intelligence that machines could never possess. What's more, machines can only do what you program them to do, so how can you program something that's better than yourself? Of course, that argument has been settled pretty decisively.
Instead, the movie explores the idea of whether the artificial beings are capable of emotion and feeling. The replicants show a great range of emotion: anger and cunning, loyalty, sadness at the death of one of their group, resentment over the treatment of others in their class, and fear over their impending deaths. In fact, they demonstrate far more emotion than the humans in the movie. Harrison Ford's character is a tough-guy detective type who drinks in order to suppress his emotions. His employer shows little concern for him despite his poor condition.
So, the question: if the replicants display what appears to humans as emotion, are they really feeling emotion? If so, do we need to be concerned about their exploitation? Were the replicants' emotions something they were programmed to display, or were they a side effect of having made them so complex? Lots of interesting questions, with no answers forthcoming in the movie. Certainly Ford's detective was affected by the replicants. He feels physically ill at having destroyed one, and it's suggested that the demands of the job are one of the reasons he drinks so much. By the end of the movie, one's sympathies are with the exploited replicants.
What does this have to do with VUI? Besides the obvious fact that the replicants have astonishing speech recognition capabilities (there's not a single "Sorry, I didn't understand" in the whole movie), there may be lessons for those who concern themselves with the design of a VUI's persona. I'll discuss the lessons in my next posting.
Thursday, June 14, 2007
"Calls recorded for quality. GOOG 411 experimental. What city and state?"
It's quick and businesslike. No effort to be cute or fancy. Let me be the first to say (at least I think I'm the first) that Google has apparently tried to capture the visual presentation of its web page in an auditory presentation. It succeeds. Its web page is just a white page with a logo, an input box, two buttons, and a small number of links. If you were trying to translate that visual presentation into a VUI, you couldn't do a better job than Google has.
By being simple and almost terse, Google created a unique, differentiated experience. If its speech browser's performance is as good as its web site, it will have a winner.
Sunday, June 3, 2007
- The second danger of standards and guidelines: if taken as gospel, they constrain active thinking by the designers who rely on them.
Designers who make informed judgments about stepping outside of common practice to try something new are engaging in what Diego Rodriguez and Ryan Jacoby call "design thinking": the act of taking risks in order to produce distinctive and usable products. Rodriguez and Jacoby's excellent article on design thinking asserts that designers take risks in order to learn and to excel, but mitigate those risks using skills that should be in every designer's skill set: prototyping, storytelling, and the ability to actively listen to customers.
So where does that leave user interface guidelines?
In fact, knowing when to ignore rules and push boundaries and when to re-use portions of previous designs and previous practice are both characteristics of good designers. Well-written guidelines are one way that re-use is achieved, and re-use is generally a good thing. The best guidelines are created (designed) from user data and from practice. They're a way of capturing the experience designers have with their designs and putting it into a usable form. The trick is in knowing when to re-use and when to take a chance and design something truly novel.
The VUI design world, in particular, suffers from a lack of published, reliable data on which to make informed decisions about VUI design. Go to any CHI or HFES conference and you'll find loads of studies on web and GUI applications - those domains have been under investigation for years. Not so in VUI design. As some speakers at VUI conferences insist, we really do need better guidelines for design. Before we get there, though, we need more and better data.
Take home message: yes, "design thinking" is good. Guidelines can be misused if taken as gospel. But we need a way to re-use our successes without resorting to copying. That's design thinking as well.