Death or birth?


The most recent IEEE Signal Processing Society Newsletter has an interesting article by David Suendermann, "Speech scientists are dead. Interaction designers are dead. Who is next?".

His argument is that "Commercial spoken dialog systems can process millions of calls per week", and therefore "one can implement a variety of changes at different points in the application and randomly choose one competitor every time the point is hit in the course of a call", using techniques like reinforcement learning to adaptively optimize the design. As a result, "the contender approach can change the life of interaction designers and speech scientists in that best practices and experience-based decisions can be replaced by straight-forward implementation of every alternative one can think of".
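
For readers who haven't seen this kind of thing, here is a minimal sketch of what "randomly choose one competitor every time the point is hit" might look like. It uses an epsilon-greedy rule, which is one simple bandit-style strategy and not necessarily what Suendermann's systems do. The class name `ContenderPoint`, the `simulate` helper, and the success rates are all hypothetical, made up for illustration.

```python
import random


class ContenderPoint:
    """One decision point in a dialog, with several competing designs."""

    def __init__(self, contenders, epsilon=0.1, rng=None):
        self.contenders = list(contenders)
        self.epsilon = epsilon              # fraction of calls spent exploring
        self.rng = rng or random.Random()
        self.trials = {c: 0 for c in self.contenders}
        self.successes = {c: 0 for c in self.contenders}

    def success_rate(self, contender):
        n = self.trials[contender]
        return self.successes[contender] / n if n else 0.0

    def choose(self):
        # Try every contender at least once before trusting the estimates.
        untried = [c for c in self.contenders if self.trials[c] == 0]
        if untried:
            return self.rng.choice(untried)
        # Explore at random with probability epsilon ...
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.contenders)
        # ... otherwise exploit the contender that looks best so far.
        return max(self.contenders, key=self.success_rate)

    def record(self, contender, success):
        self.trials[contender] += 1
        self.successes[contender] += int(success)


def simulate(true_rates, calls, seed=0):
    """Route simulated calls through one decision point.

    true_rates maps each contender to an assumed (made-up) probability
    that a call using it ends successfully.
    """
    rng = random.Random(seed)
    point = ContenderPoint(true_rates, epsilon=0.1, rng=rng)
    for _ in range(calls):
        contender = point.choose()
        point.record(contender, rng.random() < true_rates[contender])
    return point


if __name__ == "__main__":
    point = simulate({"prompt_A": 0.2, "prompt_B": 0.8}, calls=5000)
    for name in point.contenders:
        print(name, point.trials[name], round(point.success_rate(name), 3))
```

Note that the sketch presupposes a single agreed-upon measure of "success" for each call, which is exactly the kind of thing someone has to decide.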

I yield to no one in my appreciation for Big Data, eScience (or in this case, I guess, eEngineering…), the Fourth Paradigm, and all that. And reinforcement learning is a fine technique, with interesting roots in the extraordinary Rescorla-Wagner model of classical conditioning (though there are other ideas around). Everyone should know about this stuff, and apply it where it works, in spoken dialog system optimization as elsewhere.

But I think that David goes (or implies going) way too far: IMHO, the massive data accumulation in the digital networking of the whole world is going to increase, not decrease, the demand for scientists and engineers. I don't have time to say anything more about it, for now, so feel free to discuss the question among yourselves.

[Update — Fernando Pereira, who is in a good position to know, has a quick list of contrary bullet points:

  • Even with all that data — and I agree it's growing fast as voice interaction on smart phones becomes ubiquitous — randomized search will be swamped by the combinatorial possibilities of interface design.
  • The only way to manage the combinatorics is to impose intelligent biases on the search process. That is, we need engineers and designers who understand and know how to apply the relevant computer science and statistics.
  • Automated tools do not achieve good designs by themselves, because we do not know how to quantify good design as a mathematical objective function, even if the combinatorial problem could be tamed. We need human designers to steer the tools, evaluate the results, and recognize potential disasters. Their training may be different, but they are not 'dead'.

]
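
To put rough numbers on the combinatorial point in the first bullet, here is a back-of-the-envelope sketch. The figures (20 decision points, 3 alternatives at each, a million calls per week) are illustrative assumptions of mine, not numbers from Suendermann or Pereira, and the function name is hypothetical.

```python
def calls_per_design(calls_per_week, decision_points, alternatives_per_point):
    """Return (number of complete designs, average weekly calls per design).

    Assumes every decision point offers the same number of alternatives
    and that the alternatives can be combined freely.
    """
    designs = alternatives_per_point ** decision_points
    return designs, calls_per_week / designs


if __name__ == "__main__":
    designs, rate = calls_per_design(
        calls_per_week=1_000_000,
        decision_points=20,
        alternatives_per_point=3,
    )
    print(f"{designs:,} designs, {rate:.6f} calls per design per week")
```

Under these assumptions there are far more complete designs than calls, so most combinations would never be tried even once. Treating each decision point independently shrinks the problem, but only by assuming the choices don't interact, which is itself a design judgment.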



12 Comments »

  1. Mark P said,

    April 23, 2010 @ 12:23 pm

    I imagine there are lots of practical problems that "implementation of every alternative one can think of" can solve. Isn't that the way computer programs beat chess masters? On the other hand, it may not lead to much understanding. In the history of science, the analysis of large (for the time) quantities of astronomical observations led to a relatively good predictive model involving epicycles. I suppose continued implementation of every alternative one can think of would have led to a Sun-centric model of the solar system, but if the Earth-centric model could predict as well (or as well as necessary), why choose one over the other? Aside from the fact that an Earth-centric model would have made sending probes to Mars harder.

  2. Nick Lamb said,

    April 23, 2010 @ 12:29 pm

    "Rich Phone Applications" is what David's company calls their business sector. I think by this they mean (Rich (Phone Applications)). But this sector is dying, and the reason isn't anything to do with Big Data or the Fourth Paradigm; it is the increasing dominance of ((Rich Phone) Applications). Instead of literally talking to a machine (which remains awkward) people are happy to communicate with it by pointing, tapping and gesturing at a touch screen interface they carry in their pockets. The telephone got smarter.

    Nobody wants to fight their way through voice prompt menus, no matter how "optimal". Telephone helplines should be reserved for situations where a (human) representative of the company needs to talk to the (also human) customer. They're a lousy alternative to the GUI for interacting with machines.

  3. Nick Lamb said,

    April 23, 2010 @ 12:40 pm

    Mark P, actually your instinct is correct: get the maths right and you can get the same answers with a heliocentric or a geocentric model. Or you can put a small rock near Jupiter in the middle; it doesn't matter. Check out the 1905 paper "Zur Elektrodynamik bewegter Körper" by a certain Albert Einstein and his later 1915 work which makes it all work properly with gravity. It turns out there is no privileged frame of reference, so you can just pick whichever is convenient.

  4. peter said,

    April 23, 2010 @ 1:08 pm

    When SPSS and similar statistical analysis software programs began appearing in the mid-1960s, many statisticians thought that the widespread use of these programs would result in unemployment for statisticians. In fact, lowering the expertise-threshold necessary for analysis of statistical data increased the demand for people with high levels of statistical expertise — to advise (and to rectify the work of) those without adequate expertise.

  5. Jens Fiederer said,

    April 23, 2010 @ 2:27 pm

    "Commercial spoken dialog systems can process millions of calls per week" reminds me of that skit they used to have so many versions of on Saturday Night Live.

    It was about Toonces, the cat who could drive a car. Pretty much every episode ended with stock footage of a car falling off a cliff.

    Yes, Toonces could drive a car. "But not very WELL."

  6. Peter Taylor said,

    April 23, 2010 @ 6:44 pm

    Does that Fourth Paradigm link work for anyone? It's the second time this week I've followed a link to that page from LL, and both times the server has been unresponsive.

  7. John Roth said,

    April 23, 2010 @ 7:04 pm

    What I got out of this is something a bit different. It's an approach that's used in a number of large web sites, and recommended by quite a few leading designers: get feedback on what works from how the customers who are trying to use your site actually behave, dammit!

    On a large enough site, you don't have to try one approach at a time. You can try several, or several dozen, and keep track of how people react to each alternative.

    Beyond that, I'm not sure what he's recommending. Unless he's recommending automatically generating the alternatives, it's hardly new.

  8. MD said,

    April 23, 2010 @ 7:17 pm

    I do research in spoken dialogue systems, and I don't see my work dying any time soon ;-) The main rule of statistics/machine learning is "Garbage In, Garbage Out". There is no "just" in annotating for semantic representations. An annotation project that can provide reliable data for machine learning will cost a lot of money to run, and requires supervision by people who actually understand how systems work (= Interaction Designers).

    And there is your basic chicken and egg problem: you can optimize by recording lots of calls - but only if you have a reasonably working system first to provide contending choices. Otherwise you are going to alienate lots and lots of customers with bad choices to start with. You can theoretically "pre-optimize" by building simulated users, but your optimization will still be only as good as your simulated user is, which again requires someone who understands how systems work.

    So, tools may change and the type of work will change, but I think the news of my death is premature.

  9. Mel Nicholson said,

    April 23, 2010 @ 10:18 pm

    I've heard this joke before with other punchlines…

    "Because of computers, we will have a paperless office."

    "Email will eliminate all that useless junk mail you've been bothered by."

    "Internet related technology should pan out in about ten years, after which we won't need so many programmers."

    "With the invention of the Atomic Bomb, war will become unthinkable."
    "With the invention of dynamite, war will be impossible."
    "The crossbow should spell an end to war because armor is now useless."

    I can personally attest to hearing statements with the same intent as all but the last two. They all have the same blind spot for the fact that new solutions breed new problems.

  10. Okko said,

    April 24, 2010 @ 6:38 am

    Article reads: "Instead of carefully tweaking rule-based grammars, user dictionaries, and confidence thresholds, there is a lazy but high-performing recipe. One needs to systematically collect large numbers of utterances from all the contexts of a spoken dialog system, transcribe these utterances, annotate them for their semantic meaning, and train statistical language models and classifiers to replace grammars that have been used in these recognition contexts before."

    But this recipe is precisely what speech scientists do, and it's usually more, not less, time-consuming than "carefully tweaking rule-based grammars"…

  11. Aaron Davies said,

    April 25, 2010 @ 12:31 am

    I think they're talking about Google-ish engineering, where alternative/new features are automatically tested on random subsets of users to breed the best possible combination.

  12. Okko said,

    April 25, 2010 @ 11:14 am

    Sure, but a. the necessary annotation process is far costlier than a bit of rule tweaking (automated or not), and b. speech scientists/interaction designers have (for years) been employing statistical methods for improving speech applications, with various degrees of (un-)supervision.

    Most importantly, the data can't speak for itself without a model, the creation of which has, in essence, been at the core of speech science and user interface job descriptions.

    Maybe the automatic generation and evaluation of tuning parameters provides some novel tools to the speech science/user interface toolkit (replacing some of the "gut feeling" approach criticized), but I don't see any mass firings coming up. The sorry state of some speech apps out there makes me think we'll need more of them.
