Google adds Task List to GMail Labs
Thursday December 11th 2008, 11:17 am
Filed under: email, information delivery, language technology, technology
Posted by: Andrew Lampert

I’ve been traveling for the past couple of weeks, so missed the announcement of Tasks as a new feature in GMail Labs. Given my own interests in tasks in email, this seems to be the most useful Labs feature to surface so far. Also of interest are the nearly 500 threads discussing ideas for future enhancements to the Tasks plugin.

Tasks for Gmail Labs

The focus seems to be on lightweight interaction, which is definitely the right approach. To add a new task, for example, you just click in an empty part of the task list and start typing. This seems pretty similar to the style of task interaction pioneered by Remember the Milk, and I’d be interested to know how it compares with RTM’s GMail services, particularly their recently announced RTM GMail gadget that can be added via GMail Labs. Are there any users out there who have experimented with RTM’s tools and can offer insight on the comparative strengths and weaknesses of the new Labs task addition?

There doesn’t seem to be much in the way of tight interaction between email and tasks (yet), but I’m sure this will be in the pipeline for future enhancements.

On the topic of tasks in email, if you’re interested in learning more about how people phrase tasks in email messages, have a look at my recent paper, Requests and Commitments in Email are Complex: Eight Reasons to be Cautious, which I presented at the Australasian Language Technology conference in Hobart earlier this week.



Promoting Human Communication Science
Thursday August 07th 2008, 1:01 pm
Filed under: information delivery, language technology, research, science, search
Posted by: Andrew Lampert

A few months back, I had a conversation about my PhD work with Kate Stevens, one of the members of the executive for HCSNet, an Australian Research Council funded collaboration network for researchers working on topics in the broad space of Human Communication Science.

Parts of my on-camera conversation with Kate have made it into the recently released HCSNet promotional video, which is now available on YouTube. It’s always a bit weird seeing yourself on camera, particularly when sound bites are taken from a much longer conversation! Given the totally unscripted nature of what was recorded though, I think it’s worked out quite well.

Of course, this is also a good opportunity to actually plug the annual HCSNet Summerfest, which will be held at UNSW in Sydney in December. If you’re interested in speech, language, sonics, psychology or any number of topics in between, check out what’s on offer – it’s well worth a few days of your time to meet some inspiring people.



Do we need sentiment analysis for email?
Tuesday January 22nd 2008, 12:32 pm
Filed under: email, information delivery, language technology, research, technology
Posted by: Andrew Lampert

Brij Singh at MessageDance has posted an interesting motivation for applying sentiment analysis to incoming email. He asks whether the sentiment evoked by incoming email results in cognitive turnover for knowledge workers, thus disrupting their productivity.

Brij thinks that the application of sentiment analysis to email could help address this mental wandering for knowledge-based employees:

I think it’s high time for companies to invest in sentiment classification and routing toxic emails to platform where immediate impact on employee productivity is less. Can carefully controlled social platform enable this process?

Having just yesterday attended a research presentation by Mary Gardiner on sentiment classification, it’s interesting to consider the possibilities and practicalities of applying the sentiment classification techniques to email.

One unsupervised technique, pioneered by Turney and Littman, is to use pointwise mutual information (PMI) and word co-occurrence counts from a search engine to help determine the valence of each word in a text. Turney and Littman used the NEAR operator in Altavista to determine the co-occurrence of each word in their text to be classified (in our case, this would be each word from an incoming email message) with each word from a set of words with known positive or negative valence. The counts for co-occurrence with the known-positive words contribute to the positive sentiment of our unclassified word, while counts for co-occurrence with negative words contribute to the negative sentiment. These co-occurrence counts are then normalised and combined to determine the overall valence of each word from our unclassified text. The technique, though simple, worked surprisingly well (80% classification accuracy at the word level), much better than many more complex techniques.
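To make the arithmetic concrete, here is a minimal sketch of the word-level calculation. It is not Turney and Littman's implementation: the seed words, the `hit_counts` and `so_pmi` names, the token window standing in for the NEAR operator, and the small local corpus standing in for a search engine index are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical seed words with known valence (kept the same size so that
# the hits(word) and corpus-size terms of PMI cancel out).
POSITIVE = ("good", "excellent")
NEGATIVE = ("bad", "poor")


def hit_counts(docs, window=10):
    """Return (hits, near) document counts for a tokenised corpus.

    hits[w]      = number of documents containing w
    near[(a, b)] = number of documents where a and b occur within
                   `window` tokens of each other (a stand-in for NEAR)
    """
    hits, near = Counter(), Counter()
    for tokens in docs:
        hits.update(set(tokens))
        pairs = set()
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    pairs.add((word, other))
                    pairs.add((other, word))
        near.update(pairs)
    return hits, near


def so_pmi(word, hits, near, smooth=0.01):
    """Valence of `word`: PMI with positive seeds minus PMI with negative seeds."""
    score = 0.0
    for seed in POSITIVE:
        score += math.log2((near[(word, seed)] + smooth) / (hits[seed] + smooth))
    for seed in NEGATIVE:
        score -= math.log2((near[(word, seed)] + smooth) / (hits[seed] + smooth))
    return score


docs = [
    ["great", "service", "excellent"],
    ["terrible", "service", "bad"],
    ["great", "good", "food"],
    ["terrible", "poor", "food"],
]
hits, near = hit_counts(docs)
for w in ("great", "terrible", "service"):
    print(w, round(so_pmi(w, hits, near), 2))
```

A positive score suggests positive valence and a negative score the opposite. The smoothing constant only guards against zero counts; on a corpus this small the absolute values mean very little.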

Ignoring the sad reality that the NEAR operator is no longer available to use in Altavista queries (and that no other search engines offer an operator of similar functionality in their public query interface), it’s interesting to think about whether such a technique could be usefully applied to email. I don’t know if people have addressed how to move from word-level classification up to message-level sentiment classification, but it doesn’t seem to be an insurmountable problem.

More of an issue for email is whether people would be happy for the entire text of their email messages to be sent in clear text to a single search provider. Depending on the volume and nature of data on a user’s own machine, perhaps we could use the desktop search interface to approximate Turney and Littman’s technique, without passing sensitive email data out onto the network? Of course, there’s a big difference in the scale of corpus being used to generate the co-occurrence counts in this case – Altavista, at the time of the experiment, claimed to be indexing around 100 billion words. My desktop search index claims to contain about 1.5 million items (email messages, documents, visited web pages etc.). While that’s not going to get us to 100 billion words, it might be enough to get some credible results.



Email Companies
Wednesday November 28th 2007, 11:49 am
Filed under: email, information delivery, technology
Posted by: Andrew Lampert

There’s a nice overview article in the Wall Street Journal of companies working on the email overload problem. Deva Hazarika, CEO of ClearContext, presents some thoughts on the article, with reference to how ClearContext are tackling the identified email problems.

For people interested in Xobni’s work on displaying person profiles directly in the mail client: they’ve recently added further functionality to their Outlook plugin.



Xobni Looking for Latent Structure in Email
Friday October 19th 2007, 4:41 pm
Filed under: email, information delivery, language technology, research, search, technology
Posted by: Andrew Lampert

Email seems to be a flavour of the moment, and Chris Morrison continues the trend over at VentureBeat with a short but informative write-up of four startups innovating around email.

Fuser and Orgoo both focus on the integrated/universal messaging client, bringing IM, social networks and other communication media into a single client alongside email. Xoopit is still in stealth mode, so they haven’t revealed much publicly about the details of their work, but their focus appears to be on extracting and compiling collections of attached documents, images etc. from email archives. More interesting to me is Xobni, which I’ve been following with some interest since Vitor Carvalho brought the company to my attention a few weeks ago.

Chris Morrison notes that while Xobni already pulls out some information like phone numbers from email, there’s much more information waiting for someone to find an innovative way to highlight. Of course, highlighting is only one option for making such structure available and useful for end users. Matt Brezina, co-founder of Xobni, also comments about the latent, untapped structure in email:

“There’s a structure that just hasn’t been broken apart and exposed”
Matt Brezina – Co-Founder Xobni

I think Matt is right on target with this assessment. It’ll be interesting to see which avenues of structure they pursue. I have my own ideas on important latent structure in email, some of which you can hopefully read about in an upcoming conference paper. More details coming if and when the paper actually gets accepted.



Research Seminar Podcast
Monday October 09th 2006, 9:02 pm
Filed under: csiro, information delivery, language technology, research, science, search, technology, usability
Posted by: Andrew Lampert

So I’ve taken the plunge and created my first podcast which is also available through iTunes. Don’t be afraid though – you won’t hear much from me except the occasional speaker introduction – it’s a podcast of recorded seminars from the research seminar series that I’ve been jointly running with Cecile Paris at the CSIRO ICT Centre for the past 5 years. The seminar series itself pre-dates my time at CSIRO however – 2006 is its 10th consecutive year!

Anyway, if you’re at all interested in human factors, artificial intelligence or language technology, take a moment to tune in – we have some excellent talks coming up in the near future. As you can see from our collection of past seminars, topics range widely including research and applications in usability, human-computer interaction, user modelling/personalisation, novel interfaces, natural language processing, linguistics, information retrieval, speech processing, system evaluation, computer supported cooperative work, cognitive science and more.



Using Context to Deliver Useful Information to People
Tuesday September 05th 2006, 10:31 pm
Filed under: csiro, information delivery, language technology, research, science, search, technology, usability
Posted by: Andrew Lampert

As Mitch Kapor, founder of Lotus Development Corporation, once said, “Getting information off the Internet is like taking a drink from a fire hydrant.”

On September 19th, I will be presenting a seminar to the NSW branch of CHISIG - the Computer-Human Interaction Special Interest Group of Australia – about our research in CSIRO that focuses on controlling the flow of information to deliver the right content to the right people at the right time in the right form.

Our research approaches the problem by using knowledge about users and their interaction to tailor the information that is gathered and to present it appropriately. The context information that is captured and reasoned about can include user preferences and characteristics, as well as details of a user’s current task, their previous history of interaction and their environment. This context can determine which information should be retrieved, and how that content should be aggregated, organised, and presented, in order to best support the user.

My presentation will cover work that builds on concepts and techniques from a variety of different fields, including: natural language generation, information extraction, information retrieval, discourse analysis, user modelling, task analysis and HCI, so if any of those topics spark interest (and you happen to be in Sydney) you might consider coming along to PTG Global on Tuesday 19th.


CeBIT Australia 2006
Thursday May 11th 2006, 11:22 pm
Filed under: csiro, information delivery, language technology, research, science, search, technology
Posted by: Andrew Lampert

After returning from leave, I was immediately immersed in last minute preparations for CeBIT Australia 2006. After spending much of Monday afternoon assisting with the construction and setup of the CSIRO stand, I then spent 2 days this week at CeBIT show-casing ICT research from across a range of CSIRO divisions. Our main demonstration was again SciFly, our tailored brochure generation system – with much improved robustness and performance from last year. I had several interesting discussions with interested people about applying SciFly and the underlying technology to a range of problems across a variety of industries. For me, this was the most satisfying success metric of my time at CeBIT.

As well as demonstrating our technology at the CSIRO booth, I also gave a short seminar on Contextualised Information Retrieval and Delivery as part of the Future Parc seminar series. The environment was a challenging one for speakers, with much background noise, unreadably small plasma screens for displaying slides, and no fewer than 6 parallel sessions of seminars at various points around CeBIT to compete with. Despite this, I think I managed to engage at least some of the people in the audience, based on the couple of thoughtful follow-up discussions that I had after the seminar.



Activities and Tasks in Emails
Friday February 03rd 2006, 4:45 pm
Filed under: email, information delivery, language technology, research, science, technology
Posted by: Andrew Lampert

So I was busy at the International Conference on Intelligent User Interfaces (IUI) earlier this week, and it was a hugely motivating and thought-provoking experience. A great bunch of really switched-on people doing all kinds of interesting things.

One presentation that particularly caught my interest was from Tessa Lau at IBM Almaden Research lab – not surprising really, given that I’ve read about some of Tessa’s previous work in email management. The work she presented at IUI was on IBM’s Unified Activity Management project (UAM). In that context, one of her points that really rang true for me was about the need to move away from being focused on tools to focus more on the activities people perform when dealing with information management. This should, of course, lead to the development of software applications that do a better job of supporting users, who are (and should be!) more concerned about their tasks and activities than about which tool they used to do what, and how they can integrate work that happens to have been performed using different software tools.

As a simple example, rather than keeping the email messages for a given activity in an email client and the related Excel documents in a separate folder on the filesystem, can we instead cluster all relevant information together based on the activity that ties the various artifacts together, regardless of the tool that happened to have been used to create them? Accordingly, a major part of the UAM project is focussed on integrating email content into an overall activity management system that is under development. To do so requires an ability to associate email content with new or existing activities. Obviously, this requires a light-weight and simple way of creating new activities from email, and of displaying email in the context of existing activities.

When trying to associate incoming email messages with new and existing activities, the IBM team seems to have been inspired by the information retrieval community in using recommendation rather than all-or-nothing mapping of incoming messages to activities. This is a clever way of reducing the likelihood of frustrating users with incorrect categorisations, and is indeed the approach we took in earlier email categorisation work I was involved with a few years ago at CSIRO.
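As a rough illustration of recommendation versus all-or-nothing mapping, the sketch below ranks candidate activities for a message and returns the top few instead of a single label. It is not the UAM system or the CSIRO categoriser: the bag-of-words cosine similarity, the `recommend_activities` name, and the example activities are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend_activities(message, activities, k=3):
    """Return up to k (activity, score) pairs, best match first.

    `activities` maps a hypothetical activity name to text already
    associated with it. Activities with no overlap are left out, so the
    list may be empty and the user can create a new activity instead.
    """
    msg = Counter(message.lower().split())
    scored = [
        (cosine(msg, Counter(text.lower().split())), name)
        for name, text in activities.items()
    ]
    scored.sort(reverse=True)
    return [(name, round(score, 3)) for score, name in scored[:k] if score > 0]


activities = {
    "budget": "quarterly budget spreadsheet forecast",
    "hiring": "interview candidate software engineer",
    "travel": "flight hotel conference booking",
}
result = recommend_activities("please review the budget forecast spreadsheet", activities)
print(result)
```

Because the output is a ranked list, a wrong guess costs the user a glance instead of a misfiled message, and nothing stops them from accepting more than one suggestion.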

Tessa also referred to email signatures as ‘noise’ that, by implication, needs to be removed to recover the communication signal conveyed by email – a very simple and logical description of the nature of email signatures (and often quoted material) in the context of automatic processing of emails.
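A minimal sketch of that kind of noise removal follows. It leans on two common plain-text conventions, a signature separator line of two dashes (usually followed by a space) and quoted lines starting with `>`; the `strip_noise` name is hypothetical, and real mail varies far more than this handles.

```python
def strip_noise(body):
    """Drop quoted lines and everything after a signature separator."""
    kept = []
    for line in body.splitlines():
        # "-- " is the conventional signature separator; also accept "--"
        # because trailing whitespace is often stripped in transit.
        if line in ("-- ", "--"):
            break
        # Lines beginning with ">" are treated as quoted material.
        if line.lstrip().startswith(">"):
            continue
        kept.append(line)
    return "\n".join(kept).strip()


body = "Can you send the report?\n> earlier text\n-- \nAndrew\nCSIRO"
print(strip_noise(body))
```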

One weakness of the work presented was an implicit assumption that a single email message should be associated with at most one activity. Clearly this suffers from a multiple-inheritance style problem – in practice a single email message can often contain content that is relevant to many different activities. In the present system it is not possible to apply multiple activity labels to a single message. This, of course, sounds a lot like the folder vs. labelling problem that has been all the rage since GMail appeared on the radar.

Another interesting question is whether classifying email messages into activities is different from the classification of emails into folders (which is a well studied text categorisation problem). There certainly seem to be many similarities between both problems. Perhaps there is a difference of focus (folder classification generally being for archiving, and activity classification more for current work), but this is purely speculation.

Of particular interest for me was that Tessa identified speech act detection in email as a future direction for their research. This is motivating, given that smart people see similar value in the kinds of ideas I’m playing with, but it is also rather intimidating to think about who my competitors in the research world include! I think I’d better get a move on with my own research!



R&D Software Engineer Wanted
Friday January 13th 2006, 9:22 am
Filed under: csiro, information delivery, java, language technology, research, science, search, technology, usability
Posted by: Andrew Lampert

Ok, so if you’re a software engineer looking for new challenges in 2006, here’s a great opportunity for you. My research team within the CSIRO ICT Centre (the Information Delivery team) is seeking to recruit a highly competent, motivated, and energetic software engineer to our Sydney laboratory.

You will contribute to software engineering, R&D and commercialisation activities within our small but highly productive team carrying out leading-edge research in the area of information engineering and the development of advanced search and delivery technology. This role will have a particular focus on mobile phone and PDA technology.

A degree in Software Engineering or a related discipline is essential; an honours degree or higher qualification would be an advantage, but not essential.

We need you to demonstrate excellent programming expertise in at least Java (preferably other languages too), familiarity with Web services, and preferably have exposure to mobile phone or PDA software development platforms. The development projects underway need you to work on both research prototypes and on commercial products. Your willingness to provide technical support, an ability to write high quality documentation, and a capacity to talk to customers are important.

Finally, you should enjoy working in teams, be honest, trustworthy, and ethical, with an ability to contribute creative ideas to our projects.

Reference Number: 2006/63
Position Title: Software Engineer – Information Delivery
Division: CSIRO ICT Centre
Location: North Ryde, NSW
Classification: CSOF4 to CSOF5
Salary Range: $58k – $72k + superannuation
Tenure: 12 month term
Applicants: International Applicants Welcome
Relocation Assistance: May be offered to the successful applicant.
Applications Close: 27 Jan 2006
Job Category: Computer Software/Scientific Research

For further details, selection criteria and to apply for this position, please visit: http://recruitment.csiro.au/asp/job_details.asp?RefNo=2006/63

If you have any questions about this position, please post a comment here, or feel free to email me (Andrew.Lampert@csiro.au).