\u201cDomain Adaptation for Commitment Detection in Email\u201d<\/a>: bias in the datasets available to train commitment detection models.<\/p>\nResearcher access is generally limited to public corpora, which tend to be specific to the industry they\u2019re from. In this case, the team used public datasets of email from the energy company Enron and an unspecified tech startup referred to as \u201cAvocado.\u201d They found a significant disparity between models trained and evaluated on the same collection of emails and models trained on one collection and applied to another; the latter model failed to perform as well.<\/p>\n
\u201cWe want to learn transferable models,\u201d explains White. \u201cThat\u2019s the goal\u2014to learn algorithms that can be applied to problems, scenarios, and corpora that are related but different to those used during training.\u201d<\/p>\n
To accomplish this, the group turned to transfer learning, which has been effective in other scenarios where datasets aren\u2019t representative of the environments in which they\u2019ll ultimately be deployed. In their paper, the researchers train their models to remove bias by identifying and devaluing certain information using three approaches: feature-level adaptation, sample-level adaptation, and an adversarial deep learning approach that uses an autoencoder.<\/p>\n
Emails contain a variety and number of words and phrases, some more likely to be related to a commitment\u2014\u201cI will,\u201d \u201cI shall,\u201d \u201clet you know\u201d\u2014than others. In the Enron corpus, domain-specific words like \u201cEnron,\u201d \u201cgas,\u201d and \u201cenergy\u201d may be overweighted in any model trained from it. Feature-level adaptation attempts to replace or transform these domain-specific terms, or features<\/em>, with similar domain-specific features in the target domain, explains Sim. For instance, \u201cEnron\u201d might be replaced with \u201cAvocado,\u201d and \u201cenergy forecast\u201d might be replaced with a relevant tech industry term. The sample level, meanwhile, aims to elevate emails in the training dataset that resemble emails in the target domain, downgrading those that aren\u2019t very similar. So if an Enron email is \u201cAvocado-like,\u201d the researchers will give it more weight while training.<\/p>\nGeneral schema of the proposed neural autoencoder model used for commitment detection.<\/p><\/div>\n
The most novel\u2014and successful\u2014of the three techniques is the adversarial deep learning approach, which in addition to training the model to recognize commitments also<\/em> trains the model to perform poorly at distinguishing between the emails it\u2019s being trained on and the emails it will evaluate; this is the adversarial<\/em> aspect. Essentially, the network receives negative feedback when it indicates an email source, training it to be bad<\/em> at recognizing which domain a particular email comes from. This has the effect of minimizing or removing domain-specific features from the model.<\/p>\n\u201cThere\u2019s something counterintuitive to trying to train the network to be really bad at a classification problem, but it\u2019s actually the nudge that helps steer the network to do the right thing for our main classification task, which is, is this a commitment or not,\u201d says Sim.<\/p>\n
Empowering users to do more<\/h3>\n
The two papers are aligned with the greater Microsoft goal of empowering individuals to do more, tapping into an ability to be more productive in a space full of opportunity for increased efficiency.<\/p>\n
Reflecting on his own email usage, which finds him interacting with his email frequently throughout the day, White questions the cost-benefit of some of the behavior.<\/p>\n
\u201cIf you think about it rationally, it\u2019s like, \u2018Wow, this is a thing that occupies a lot of our time and attention. Do we really get the return on that investment?\u2019\u201d he says.<\/p>\n
He and other Microsoft researchers are confident they can help users feel better about the answer with the continued exploration of the tools needed to support them.<\/p>\n","protected":false},"excerpt":{"rendered":"
As email continues to be not only an important means of communication but also an official record of information and a tool for managing tasks, schedules, and collaborations, making sense of everything moving in and out of our inboxes will only get more difficult. The good news is there\u2019s a method to the madness of staying on top of your email, and Microsoft researchers are drawing on this behavior to create tools to support users. <\/p>\n","protected":false},"author":37074,"featured_media":566580,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[194455],"tags":[],"research-area":[13556,13545],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-566577","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[643845,644373],"related-projects":[],"related-events":[558867],"related-researchers":[],"msr_type":"Post","featured_image_thumbnail":"","byline":"","formattedDate":"February 8, 2019","formattedExcerpt":"As email continues to be not only an important means of communication but also an official record of information and a tool for managing tasks, schedules, and collaborations, making sense of everything moving in and out of our inboxes will only get more difficult. The…","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/566577"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/37074"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=566577"}],"version-history":[{"count":6,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/566577\/revisions"}],"predecessor-version":[{"id":698302,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/566577\/revisions\/698302"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/566580"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=566577"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=566577"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=566577"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=566577"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=566577"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=566577"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=566577"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=566577"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=566577"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=566577"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=566577"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}