Is e-mail obsolete? Far from it. We continue to gather more and more information in our inboxes: personal and professional communications, but also marketing and commercial ads, alerts and notifications from websites or social networks, search engine results, agendas, …
The NextMail’11 workshop will focus on current research and emerging trends in email research. I’m happy to be a part of the program committee for the workshop, which will be held as part of the IEEE / WIC / ACM International Conferences on Web Intelligence and Intelligent Agent Technology on August 22, 2011 in Lyon, France.
You can read the full Call For Papers for all the details, but relevant topics include:
Email content analysis, information extraction, summarization
Email social networks in enterprise
Email management strategies within organizations
Adaptive email agents and semantic agents
Email archives exploration, visualization, regulations and behaviors
Email visual interfaces and human/computer interaction with emails
Case studies, experiments and user studies on email usage
Benchmark and email testing datasets
Interoperability over email with enterprise resources and legacy systems
Semantic email and email mining
Unified messaging and web interactions: instant messaging, RSS feeds, annotations, tagging
Personal information management integration in email clients, pending task management
Interaction between email, PIM and the mobility factor
Facing the volume growth, do we need to replace the old protocols?
Evolution of infrastructures and uses
Papers are due by 21st March 2011, so get writing!
An interesting aspect of online group communication is the phenomenon of backchannel. Backchannel in computer-mediated communication (CMC) allows participants within a group conversation to exchange private communication which is visible only to the sender and receiver. Many existing forms of CMC provide such capability – think IRC, Skype and even Twitter (through direct messages).
Launched 5 months ago, Subtextual (until recently known as bccthis) is an interesting plug-in for Microsoft Outlook that allows the mixing of public (visible to all recipients) and private (visible only to specific recipients) content within a single email message. This allows a sender to send a single message, but add private context addressed to only those people that need it.
Subtextual adds the ability to send a hidden message as part of a normal email message. This hidden content is visible only to selected message recipients – other recipients never see any indication that the message has any additional content. Happily, recipients don’t need any plug-in to view Subtextual messages.
While clearly an interesting idea, I’m not sure whether Subtextual is significant enough to be more than just another feature for Outlook. I am, however, impressed with their family of products across desktop, mobile and web-based email. Together with their recently announced premium version of the Outlook plug-in, it feels like the company is busy experimenting, trying to discover the platforms which can deliver them traction, customers and revenue. I am very interested to see in which direction this company will pivot in the future.
In work reminiscent of their original ReMail research, IBM is rethinking email for mobile devices. Their focus is on fast email triage, including how to capture intended actions, such as those that might be actioned on the desktop at a later time (rather than on the mobile device).
While it’s widely acknowledged that desktop email clients have been slow to adapt to changing volumes and styles of email use, the problem is arguably more acute in the mobile space. For starters, obviously the device form factors influence how people use mobile email – you’re not likely to see people typing long-winded messages with their thumbs – yet many mobile email clients are essentially designed as smaller versions of desktop email clients. Mobile email users typically focus on triaging their messages to determine what’s new, what they can delete right away, and what’s important enough to handle immediately. They often defer everything else until they are at a desktop or laptop with a full keyboard and larger display.
I think it’s worth spending 7 minutes or so to watch the video below, where Jeff Pierce outlines the project:
A quick note about a new feature in the email client on the iPhone in the latest iOS 4 release. When you receive an email with a date or time mentioned in it, Apple’s email client automatically detects the date, and presents it as an underlined hyperlink. Tapping the date then creates an event in your Calendar on the date/time that was recognised in the text. Apparently it defaults to using the email’s subject line as the event title. (As an aside, it’s also worth noting that Gmail has had similar functionality since around 2006.)
I’m guessing Apple has rolled their own date recognition code, probably using simple rules or regular expressions. Does anyone know more about the technology behind this feature?
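To make the "simple rules or regular expressions" guess concrete, here is a minimal Python sketch of rule-based date detection. It is not Apple's implementation, which I know nothing about; the pattern, the two date formats it covers, and the names `find_dates` and `suggest_events` are all my own assumptions for illustration.

```python
import re
from datetime import datetime

# Hypothetical rule set: only two written date styles are covered,
# e.g. "21st March 2011" and "August 22, 2011".
MONTH_NAMES = [
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
]
_MONTHS = "|".join(MONTH_NAMES)

DATE_PATTERN = re.compile(
    rf"\b(?P<day>\d{{1,2}})(?:st|nd|rd|th)?\s+(?P<month>{_MONTHS})\s+(?P<year>\d{{4}})\b"
    rf"|\b(?P<month2>{_MONTHS})\s+(?P<day2>\d{{1,2}})(?:st|nd|rd|th)?,\s*(?P<year2>\d{{4}})\b",
    re.IGNORECASE,
)


def find_dates(text):
    """Return (matched_text, datetime) pairs for each valid date found in text."""
    results = []
    for m in DATE_PATTERN.finditer(text):
        day = int(m.group("day") or m.group("day2"))
        month = (m.group("month") or m.group("month2")).lower()
        year = int(m.group("year") or m.group("year2"))
        try:
            parsed = datetime(year, MONTH_NAMES.index(month) + 1, day)
        except ValueError:
            # Pattern matched, but it is not a real calendar date (e.g. 31 February).
            continue
        results.append((m.group(0), parsed))
    return results


def suggest_events(subject, body):
    """Propose one calendar event per detected date, titled with the subject line."""
    return [
        {"title": subject, "start": when, "matched_text": matched}
        for matched, when in find_dates(body)
    ]
```

Calling `suggest_events("NextMail deadline", "Papers are due by 21st March 2011")` proposes a single event titled with the subject line. Even this toy version shows why the approach is brittle: relative expressions such as "next Tuesday" or numeric dates such as "3/4/11" need extra rules, and the latter are ambiguous between locales.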
(Hat tip to Rob Tot for alerting me to this functionality)
In the early days of email, widely used conventions for indicating quoted reply content and email signatures made it easy to segment email messages into their functional parts. Today, the explosion of different email formats and styles, coupled with the ad hoc ways in which people vary the structure and layout of their messages, means that simple techniques for identifying quoted replies that used to yield 95% accuracy now find less than 10% of such content.
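For readers who haven't seen them, the "simple techniques" in question look roughly like the Python sketch below: treat lines starting with `>` as quoted reply content, and treat everything after a `-- ` delimiter line as the signature. This is an illustrative baseline under my own assumptions (the function name `label_lines` and the three labels are hypothetical), not the Zebra system.

```python
import re

# Conventional markers from plain-text email: ">" quote prefixes,
# an "On ... wrote:" attribution line, and a "-- " signature delimiter.
QUOTE_PREFIX = re.compile(r"^\s*>")
ATTRIBUTION = re.compile(r"^On .+ wrote:\s*$")


def label_lines(body):
    """Label each line of an email body as 'author', 'quoted' or 'signature'."""
    labels = []
    in_signature = False
    for line in body.splitlines():
        if in_signature:
            # Everything after the delimiter is assumed to be signature text.
            labels.append("signature")
        elif line.rstrip() == "--":
            in_signature = True
            labels.append("signature")
        elif QUOTE_PREFIX.match(line) or ATTRIBUTION.match(line):
            labels.append("quoted")
        else:
            labels.append("author")
    return labels
```

A rule set like this handles a message that follows the old conventions, but a reply whose quoted text sits under an "-----Original Message-----" banner with no `>` prefixes is labelled entirely as author content, which is exactly the kind of failure described above.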
Many language processing and search tools stand to benefit from better knowledge of the different functional parts of email messages, since this would allow them to focus on relevant content in specific parts of a message. In particular, access to zone information would allow email classification, summarisation and analysis tools to separate or filter out ‘noise’ and focus on the content in specific zones of a message that are relevant to the application at hand. Email contact mining tools, for example, might only access content from the email signature, while tools that attempt to identify tasks or action items in email might restrict themselves to the sender-authored and forwarded content.
Last week, I presented my paper on Segmenting Email Message Text into Zones at the Empirical Methods in Natural Language Processing (EMNLP) conference in Singapore. The focus of this work is Zebra, an SVM-based system that automatically segments and classifies the body text of email messages into nine functional zone types based on graphic, orthographic and lexical cues.
Our set of nine zones includes the following: author, greeting, signoff, quoted reply, forward, signature, advertising, disclaimer and attachment. Zebra currently performs the segmentation and classification of email text into the nine zones with an accuracy of about 87%. When the number of zones is abstracted to two or three zone classes (which is much more likely to be the granularity required for real-world email processing tasks), Zebra’s accuracy increases above 91.5%.
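To illustrate what abstracting nine zones into fewer classes means in practice, here is a small Python sketch. The particular three-way grouping below (sender, quoted, boilerplate) is an assumption made for this example, so please check the paper for the groupings actually evaluated; the names `THREE_ZONE`, `collapse` and `keep_zones` are hypothetical helpers and not part of Zebra.

```python
# Assumed grouping of the nine zone labels into three coarser classes.
THREE_ZONE = {
    "author": "sender",
    "greeting": "sender",
    "signoff": "sender",
    "quoted reply": "quoted",
    "forward": "quoted",
    "signature": "boilerplate",
    "advertising": "boilerplate",
    "disclaimer": "boilerplate",
    "attachment": "boilerplate",
}


def collapse(labels, mapping=THREE_ZONE):
    """Map fine-grained zone labels onto coarser zone classes."""
    return [mapping[label] for label in labels]


def keep_zones(lines, labels, wanted):
    """Keep only the lines whose zone label is in the wanted set."""
    return [line for line, label in zip(lines, labels) if label in wanted]


# Example: a contact-mining tool that only wants signature lines.
lines = ["Hi Bob,", "See you Tuesday.", "Cheers,", "Carol", "Example Corp"]
labels = ["greeting", "author", "signoff", "signature", "signature"]
signature_lines = keep_zones(lines, labels, {"signature"})
```

The coarser labels are easier for a classifier to get right because confusions within a group (say, greeting versus author) no longer count as errors, which fits the accuracy gain reported above.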
I’m currently working to finish off the Zebra system, as well as to resolve some licensing issues so that the code can be released for other researchers to use. We have, however, already released our annotated email dataset consisting of almost 12,000 lines of annotated email text that we used to train the Zebra system. If you want to know more, you can read our paper, head over to the Zebra website, or just get in touch with me by email or other means.
Back in October 2005, the Office of Administration in the (Bush Administration) White House allegedly discovered that the Executive Office of the President had lost millions of White House emails between 2003 and 2005. In April 2007, CREW filed a Freedom of Information Act request to the Office of Administration asking for information about the missing emails. CREW sought records about the EOP’s e-mail management system, reports analyzing potential problems with the system, records of retained emails and possibly missing ones, documents discussing plans to find the missing emails, and proposals to institute a new email record system.
Sadly for CREW, the latest ruling finds that the Office of Administration is not an “agency” under the terms of the Freedom of Information Act, and thus need not comply with CREW’s request to provide information about the “misplaced” emails.
Of course, there are other cases still moving through the courts between the Executive Office of the President, CREW and other parties. And, thanks to earlier lawsuits in the 1990s, email from the White House must be treated and preserved as government records. For more information about the “lost” Bush Administration emails, the National Security Archive at George Washington University has a comprehensive chronology of the saga.
(Hat-tip to Roger Matus for alerting me to the ruling)
About 5 years ago, during my Masters studies, I wrote some simple speech applications using Java Speech API (JSAPI) 1.0 compliant speech engines. At the time, the JSR for JSAPI 2.0 was well underway. Well, it’s taken more than 8 years since the formation of the JSR, but *finally* the final release of the Java Speech API (JSAPI) 2.0 specification has been made available, released on 7th May 2009.
Of note, JSAPI 2.0 is now primarily aimed at the Java ME platform (specifically CLDC 1.0 and MIDP 1.0), meaning that it’s hoped the new spec will facilitate speech-enabled Java applications on mobile devices. For this reason, gone are all floating point references and dependencies on AWT (yay!). Recognition Engines may provide full support for application-defined grammars or provide more limited support through specialized built-in grammars. Synthesis Engines may support full text-to-speech capabilities or simple text and audio sequencing. According to documentation in the spec, implementations can require 0.5-1.5 MBytes of ROM for models and algorithms and approximately 128 KBytes of RAM depending on vocabulary and grammar size. Of course, JSAPI 2.0 compliant engines can still run on Java SE platforms, and can obviously make good use of more substantial memory and processing resources.
“We think that the API is well designed and has very comprehensive functions. However, it is therefore highly complex and requires fairly advanced speech recognition and synthesis features. It also assumes a high level of speech recognition understanding from the application developer. It might not be feasible in many Java ME devices in the near term, but can provide good features in those high end platforms where applicable.”
All in all, while it has taken a long time to come to fruition, I’m very pleased to see the JSAPI 2.0 standard finalised. Of course, given that JSAPI is only a specification (not an implementation) it remains to be seen how quickly the various speech recognition and speech synthesis systems move to support the new and modified APIs.
One of the main aims of this workshop is to gather email and enterprise computing researchers and practitioners to discuss and propose solutions for email in e-commerce and enterprise contexts.
Topics:
Architecture for enterprise cooperation and interoperability over email
Intelligent email for SMEs
Email-based business task and process management
Email content analysis, message summarization, information extraction
Semantic Email and Semantic Knowledge Extraction
Email social networks for enterprise computing
Email analysis of exchanged documents for semantic alignment via negotiation
Email Workflow Management for Business Processes
Interconnection of email content and enterprise resources (legacy systems, document repositories)
Enterprise resource mashup support for business email
Approaches for email visualization and user interfaces in business contexts
Case studies
Business email datasets
If you’re a researcher working with email, or if your startup or company is in the email space, please consider submitting a paper or demo to the workshop. Full details are available in the Call for Papers.
The focus seems to be on lightweight interaction, which is definitely the right approach. To add a new task, for example, you just click in an empty part of the task list and start typing. This seems pretty similar to the style of task interaction pioneered by Remember the Milk, and I’d be interested to know how it compares with RTM’s Gmail services, particularly their recently announced RTM Gmail gadget that can be added via Gmail Labs. Are there any users out there who have experimented with RTM’s tools and can offer insight on the comparative strengths and weaknesses of the new Labs task addition?
There doesn’t seem to be much in the way of tight interaction between email and tasks (yet), but I’m sure this will be in the pipeline for future enhancements.
On the topic of tasks in email, if you’re interested in learning more about how people phrase tasks in email messages, have a look at my recent paper, Requests and Commitments in Email are Complex: Eight Reasons to be Cautious, which I presented at the Australasian Language Technology conference in Hobart earlier this week.
A series of email messages from the controversial Yahoo! Mail account of US Republican vice-presidential candidate Sarah Palin was leaked onto the Internet today.
As with the recently announced Venezuelan government email leak, Wikileaks was again in the scrum, issuing the following press release:
The internet activist group ‘anonymous’, famed for its exposure of unethical behavior by the Scientology cult, has now gone after the Alaskan governor and republican Vice-Presidential candidate Sarah Palin.
At around midnight last night the group gained access to governor Palin’s email account … and handed over the contents to the government sunshine site Wikileaks.org.
Governor Palin has come under media criticism in the past week for using pseudo-private email accounts to avoid Alaskan freedom of information laws.
The zip archive made available by Wikileaks contains screen shots of Palin’s inbox, two example emails, governor Palin’s address box and a couple of family photos. While the emails released so far reveal little, the list of correspondence appears to re-enforce the criticism that Palin is mixing governmental and personal affairs.
The emails quoted in press articles to date seem to show that Palin has improperly used her private email account to conduct government business, thereby avoiding archiving requirements and shielding herself and her government from public scrutiny. It is unclear what, if any, action will be taken in response. According to the Sydney Morning Herald, the Secret Service contacted The Associated Press and asked for copies of the leaked emails on her Yahoo! account, but AP did not comply.
The Palin email leak is the latest in a string of unauthorised email disclosures. Ironically, it comes almost a year to the day after the MediaDefender email leak. Clearly, our recent discussion about the ethics of email corpora on the email research mailing list is a timely one!