SIGDial – Day 2
So I had an excellent time at day 2 of SIGDial, although I unfortunately missed the last couple of sessions due to a clash with the Discourse Annotation tutorial I attended. On that note, it seems strange (but presumably unavoidable) that such closely aligned sessions should clash.
Heard some interesting talks that I won’t have time to really do justice to in summarizing them here (I’m sitting in the corridor at the main Coling-ACL conference, taking advantage of a break in sessions). Most interesting work for me included:
- Work using GraphBank – which looks at discourse structure as graph-based rather than tree-based. This allows non-local (i.e., long-distance) discourse links to be modelled, which is sometimes an advantage in real discourse. GraphBank, while an interesting idea, is not without its problems, however. One specific issue is that it seems to conflate some relations, in particular actual causation with intention or purpose, which can lead to some strange annotations.
- Work from Tilburg University on (yet another) dialogue act taxonomy, called DIT++. How is it different from something like DAMSL? This isn’t entirely clear, but one point of differentiation seems to be a more elaborate and fine-grained set of feedback functions and dialogue control aspects. Given that it is a multi-dimensional annotation scheme, there are the usual problems with inter-annotator agreement. In an attempt to improve their evaluation scores, the particular work presented looked at developing agreement metrics that better suit such hierarchical schemes: a standard measure such as kappa treats two labels as either identical or different, so agreement at the coarse-grained level earns no credit whenever the annotators disagree at the fine-grained level. Unfortunately, the actual weighted metrics proposed seemed rather preliminary and arbitrary, though there is clearly a need for such work.
- A high-level presentation from David Traum on work with Question Answering characters. The main message I took out of his talk was a desire to define the ‘science’ of content creation across different modalities and methods (text, speech and graphics are their focus).
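To make the agreement problem in the DIT++ item above a bit more concrete, here is a minimal Python sketch of plain Cohen's kappa next to a weighted variant that gives partial credit when two annotators agree on the coarse-grained parent of a tag. To be clear, this is not the metric proposed in the talk: the "parent:child" tag format, the tag names and the 0.5 partial-credit weight are all my own assumptions for illustration.

```python
from collections import Counter


def exact_match(a, b):
    """Standard kappa weighting: full credit only for identical tags."""
    return 1.0 if a == b else 0.0


def partial_credit(a, b, half=0.5):
    """Hypothetical weighting: `half` credit when only the parent matches.

    Tags are assumed to be 'parent:child' strings (an illustrative format).
    """
    if a == b:
        return 1.0
    if a.split(":")[0] == b.split(":")[0]:
        return half
    return 0.0


def kappa(ann1, ann2, weight=exact_match):
    """Chance-corrected agreement; with exact_match this is Cohen's kappa."""
    n = len(ann1)
    observed = sum(weight(a, b) for a, b in zip(ann1, ann2)) / n
    p1, p2 = Counter(ann1), Counter(ann2)
    # Agreement expected by chance, from each annotator's tag distribution.
    expected = sum(
        (p1[a] / n) * (p2[b] / n) * weight(a, b) for a in p1 for b in p2
    )
    return (observed - expected) / (1 - expected)


# Toy annotations (made up): the annotators always agree on the parent
# dimension, but disagree on the fine-grained tag half of the time.
ann1 = ["feedback:positive", "feedback:negative", "task:inform", "task:question"]
ann2 = ["feedback:positive", "feedback:positive", "task:inform", "task:inform"]

print("plain kappa:   ", kappa(ann1, ann2))
print("weighted kappa:", kappa(ann1, ann2, weight=partial_credit))
```

On this toy data the weighted score comes out higher than the plain one, which is exactly why the choice of weights matters: pick them arbitrarily and you can get more or less any score you like.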
Otherwise, Coling/ACL has been a rather intense experience so far – just getting towards finishing day 4 of consecutive conference days, with another 5 still to go (although tomorrow is actually an excursion day). ACL is awesome fun though – a hugely impressive and inspiring group of people from all over the planet.
Discourse and Dialogue Research Workshop
The 7th SIGDial workshop is being held in Sydney this year, as part of Coling-ACL (for which I have been looking after the website, as part of the local organising committee). SIGDial covers both discourse and dialogue research (though many people consider it a dialogue forum). Here are some highlights from day 1 of my first SIGDial workshop:
- Using user models to tailor help provided within a spoken language dialogue system in BMW cars. The work looked at how to advise users about the available options in the context of a system offering more than 350 features where 90% of users use less than 10% of the features. The novel aspect was its attempt to model users forgetting about available features as well as their learning behaviour in using and internalising options. I still question how usable (or necessary) it is to really have 350 features, let alone 700-1000, which was the prediction for the next generation of the BMW iDrive. What exactly are people controlling while they’re driving with these devices, other than the navigation system, their music and possibly a phone?
- A paper on a multi-domain Spoken Language Dialogue System, looking at an architecture with a central module directing questions to domain specific agents with domain specific language models etc. This work focused on how to correctly identify the domain for any incoming question, including using information about the dialogue history to bias certain domain choices (e.g., bias towards staying on the existing topic depending on the probabilities returned from the speech recognition component). The presenter assumed independence between domain expert agents, and discounted the case where more than one agent is appropriate, or where information from multiple agents is required, so didn’t explore any of the interesting integration or aggregation aspects – it was really more about identifying the domain of a question/utterance, with some fairly simplistic use of discourse features (namely, the discourse history).
- Maria Georgescul gave a very interesting presentation on different algorithms for topic segmentation and different methods of evaluating their performance. She looked at the TextTiling, C99 and TextSeg algorithms, using a variety of evaluation metrics. More interestingly, she also evaluated the effect of using artificial/synthesized topic data created by concatenating fragments of documents from different domains, versus using (expensive to create) hand-annotated thematic data. The study suggests that using synthetic data to evaluate the performance of topic segmentation algorithms can give misleading results under certain conditions, which is something that has long been suspected in the field, but apparently not previously confirmed.
- The keynote from Jonathan Ginzburg looked at whether the content of an utterance in dialogue should be computed using only the grammatical information, or whether it should take account of the participant’s intention using domain-level inference. More specifically, he talked about whether grounding pertains to surface utterance content or to interlocutor intention. I was a bit disappointed overall, as I found his talk a bit opaque and hard to follow.
- Simon Keizer gave an interesting talk on multidimensional dialogue management, where he proposed a new, multi-level dialogue act annotation scheme. Disappointing for me was that he didn’t attempt to contrast or compare this with any of the existing dialogue act schemes, including DAMSL, which seemed remarkably similar. He had built a rule-based DA recognizer using mainly part-of-speech information and word patterns (cue phrases) as features.
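For anyone wondering what evaluating a topic segmenter (as in the topic segmentation talk above) actually involves, here is a minimal Python sketch of Pk, one commonly used segmentation error metric. I'm assuming Pk purely as an example; my notes don't record exactly which metrics were used in the talk. A segmentation is represented here as a list of segment lengths in sentences, which is my own choice of format.

```python
def lengths_to_labels(seg_lengths):
    """[3, 2] -> [0, 0, 0, 1, 1]: the segment id of every sentence."""
    labels = []
    for seg_id, length in enumerate(seg_lengths):
        labels.extend([seg_id] * length)
    return labels


def pk(reference, hypothesis, k=None):
    """Pk error rate: lower is better, 0.0 means no probe disagrees.

    A probe of width k slides over the text; an error is counted whenever
    the reference and the hypothesis disagree about whether the two ends
    of the probe fall in the same segment.
    """
    ref = lengths_to_labels(reference)
    hyp = lengths_to_labels(hypothesis)
    if len(ref) != len(hyp):
        raise ValueError("segmentations must cover the same sentences")
    if k is None:
        # Conventional choice: half the mean reference segment length.
        k = max(1, round(len(ref) / len(reference) / 2))
    probes = len(ref) - k
    errors = sum(
        (ref[i] == ref[i + k]) != (hyp[i] == hyp[i + k]) for i in range(probes)
    )
    return errors / probes


# Toy example (made up): 12 sentences, three reference topics of 4 sentences.
reference = [4, 4, 4]
print(pk(reference, [4, 4, 4]))  # identical segmentation
print(pk(reference, [3, 5, 4]))  # one boundary placed a sentence early
print(pk(reference, [12]))       # no boundaries found at all
```

The synthetic-versus-annotated question then comes down to running the same kind of metric over two sorts of reference data and checking whether the algorithms rank the same way on both.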
Some interesting posters included:
- Daniel Midgley’s work (I met Daniel at the HCSNet Summerfest last year) looking at adjacency pairs of dialogue acts in the VerbMobil corpus, combined with the use of some novel discourse chunking. He also applied chi-squared normalisation to filter out noise in the adjacency pairs, and ended up with data that empirically supports the original adjacency pairs proposed by Sacks and Schegloff back in 1973.
- Simone Teufel presented work that defines an annotation scheme for classifying sentiment in scientific citations. This allows sentiment classification to be added to citation graphs, and might allow for interesting applications, such as seeing which papers in your field are controversial. Which papers are used by many other papers as a theoretical basis? Which papers receive only positive citations?
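The chi-squared filtering mentioned in the adjacency pairs poster above can be sketched roughly as follows in Python. This is only my guess at the general idea, not Daniel's actual procedure: the tag names are invented (they are not VerbMobil tags), and the 3.84 threshold (the 5% critical value for one degree of freedom) and the minimum count of 2 are assumptions of mine.

```python
from collections import Counter


def score_pairs(acts):
    """Map each adjacent (first, second) tag pair to (count, chi-squared)."""
    pairs = list(zip(acts, acts[1:]))
    n = len(pairs)
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)
    second_counts = Counter(b for _, b in pairs)
    scores = {}
    for (a, b), o11 in pair_counts.items():
        o12 = first_counts[a] - o11   # a followed by something other than b
        o21 = second_counts[b] - o11  # b preceded by something other than a
        o22 = n - o11 - o12 - o21     # neither a first nor b second
        denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
        # Standard 2x2 chi-squared statistic (no continuity correction).
        chi2 = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0
        scores[(a, b)] = (o11, chi2)
    return scores


def likely_adjacency_pairs(acts, threshold=3.84, min_count=2):
    """Keep pairs that are frequent enough and score above the threshold."""
    return {
        pair
        for pair, (count, chi2) in score_pairs(acts).items()
        if count >= min_count and chi2 > threshold
    }


# Toy dialogue act sequence (made-up tags).
acts = [
    "greet", "greet", "question", "answer", "question", "answer",
    "question", "answer", "question", "answer", "thank", "bye",
]
print(likely_adjacency_pairs(acts))
```

One caveat: the statistic is unreliable for rare pairs (a pair seen exactly once can still score very highly), which is why the sketch also applies a minimum count. It also doesn't distinguish pairs that co-occur more often than chance from pairs that co-occur less often.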
Looking forward to day 2 tomorrow!