{"id":307628,"date":"2007-05-08T08:00:53","date_gmt":"2007-05-08T15:00:53","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=307628"},"modified":"2016-10-18T23:34:02","modified_gmt":"2016-10-19T06:34:02","slug":"microsoft-research-unveils-technologies-improve-web-experience","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/microsoft-research-unveils-technologies-improve-web-experience\/","title":{"rendered":"Microsoft Research Unveils Technologies to Improve the Web Experience"},"content":{"rendered":"

<em>By Rob Knies, Managing Editor, Microsoft Research<\/em><\/p>\n

Battling search spam. Streamlining Web-page monitoring. Helping protect online privacy. Enabling the illiterate to use computers.<\/p>\n

These are just a few of the ways Microsoft Research is demonstrating its commitment to making the Internet a more secure, easily searchable, user-friendly destination for consumers worldwide.<\/p>\n

\"www\"Each of those goals was featured May 8-12 during the 16th International World Wide Web Conference (opens in new tab)<\/span><\/a> (WWW 2007), to be held at the Fairmont Banff Springs Hotel, located in Alberta\u2019s Banff National Park.<\/p>\n

The conference, which attracts innovators, decision-makers, technologists, businesses, and standards bodies from around the globe, is an annual gathering to discuss the future of the Web. And, as is customary, Microsoft Research was fully invested in supporting those efforts.<\/p>\n

In its five labs worldwide, Microsoft Research undertakes a wide variety of projects designed to enhance the value of the World Wide Web, in areas as diverse as security, search, user interfaces, data mining, and technology for emerging markets.<\/p>\n

Of 111 papers accepted for the conference, 16\u201414 percent\u2014were submitted by Microsoft Research, the most of any single organization represented at the event. Four of Microsoft Research\u2019s five worldwide labs had papers accepted, and one of the papers\u2014<em>Wherefore Art Thou R3579? Anonymized Social Networks, Hidden Patterns, and Structural Steganography<\/em>, co-authored by Lars Backstrom and Jon Kleinberg of Cornell University in collaboration with Cynthia Dwork, a principal researcher for Microsoft Research Silicon Valley\u2014received the conference\u2019s Best Paper Award.<\/p>\n

Bill Buxton, Microsoft Research principal researcher, served as a plenary speaker on May 11, delivering a commentary on social networking and Web communities entitled <em>Design for the World Narrow Web<\/em>.<\/p>\n

He was hardly alone. Colleague Kentaro Toyama, assistant managing director of Microsoft Research India, participated in a panel discussion on <em>Web Delivery Models for Developing Regions<\/em>. Susan Dumais, principal researcher for Microsoft Research Redmond, also served as a panelist, on the topic of <em>Searching Personal Content<\/em>.<\/p>\n

A workshop on <em>Adversarial Information Retrieval on the Web<\/em> included participation by Microsoft Research\u2019s Krysta Svore, Qiang Wu, and Chris J.C. Burges, along with Microsoft\u2019s Aaswath Raman, authors of the paper <em>Improving Web Spam Classification using Rank-Time Features<\/em>. Another paper delivered as part of that workshop was <em>Transductive Link Spam Detection<\/em>, written by Burges, colleague Dengyong Zhou, and Microsoft\u2019s Tao Tao.<\/p>\n

Marc Najork, principal researcher for Microsoft Research Silicon Valley, served as track chair for the Tutorials and Workshops committee. Toyama was deputy chair for the Technology for Developing Regions committee, and Xing Xie, lead researcher for Microsoft Research Asia, was the deputy chair for the Browsers and User Interfaces committee. No fewer than a dozen other Microsoft Research representatives participated as members of various WWW 2007 committees.<\/p>\n

Such conference support will be further in evidence in 2008, when the event will be held in Beijing. Hsiao-Wuen Hon, principal researcher and deputy managing director for Microsoft Research Asia, will be the vice general chair for WWW 2008, and Wei-Ying Ma, principal researcher and research manager for the same lab, will be a program chair.<\/p>\n

Collaboration, as always, was a hallmark of Microsoft Research\u2019s participation in WWW 2007. Of the 16 papers accepted from the organization, 10 featured co-authorships with academic colleagues, representing 12 universities from around the world. Microsoft Research also contributed five poster papers to the conference, and four of those represented collaboration with academic partners.<\/p>\n

Stopping Search Spam<\/h2>\n

Among those academic collaborations was a paper entitled <em>Spam Double-Funnel: Connecting Web Spammers with Advertisers<\/em>, part of the conference\u2019s Industrial Practice and Experience track. The paper was co-written by Yi-Min Wang, principal researcher of Microsoft Research Redmond\u2019s Cybersecurity and Systems Management research group; Ming Ma, a research software-design engineer in the same group; and Yuan Niu and Hao Chen of the University of California, Davis.<\/p>\n

\u201cOur goal is to provide visibility into the complicated structure of the search-spam industry,\u201d Wang says, \u201cto educate the user community and the search industry on how search spammers operate and to suggest how good guys can work together to win the war against the bad guys.\u201d<\/p>\n

Search spammers use questionable search-engine-optimization techniques to promote low-quality Web pages into top search results, Wang explains. These attempts waste the time of users, who are conned into visiting junk pages before finding one with useful content.<\/p>\n

\u201cIn contrast with the common approach to search spam by merely detecting and blacklisting spam pages,\u201d Wang says, \u201cour study pursues a new, \u2018follow the money\u2019 strategy by identifying the actual companies and individuals who are involved in the search-spam industry to make money.<\/p>\n

\u201cWe show that a large part of the search-spam industry is based on advertising syndication, and it can be modeled as a double funnel with five layers. We expose the major players at each level and suggest a more effective anti-spam approach by attacking the bottleneck.\u201d<\/p>\n

Consolidating Web Updates<\/h2>\n

Another way to assist Web users is to make it easier for them to monitor pages they have identified as personally useful. This is the idea behind <em>Homepage Live: Automatic Block Tracing for Web Personalization<\/em>, a WWW 2007 paper co-written by Jie Han, Dingyi Han, and Yong Yu, of Shanghai Jiao Tong University, along with Chenxi Lin, Hua-Jun Zeng, and Zheng Chen of Microsoft Research Asia, delivered as part of the Personalization session of the conference\u2019s Browsers and User Interfaces track.<\/p>\n

\u201cWe want to enable Web users to mark blocks in Web pages and trace this block through the life of the Web page,\u201d Chen says. \u201cOur application allows users the freedom to virtually mark any block within a Web page and automatically trace the blocks when the pages change.\u201d<\/p>\n

The Homepage Live project works like this: A user selects a section of a Web page to track, and a technique called block tracing keeps that selection updated as the page is updated. The user can collect a number of sections of his or her favorite Web pages and assemble those sections on a customized page, thereby keeping abreast of pertinent information as it is updated.<\/p>\n

\u201cOur application can enhance the Web experience for users,\u201d Chen explains, \u201cby making browsing more efficient. Users no longer need to visit their favorite Web pages repeatedly. They can just mark blocks within their favorite Web pages and organize those blocks into a single page. With those simple steps, users will be able to follow all their favorite Web pages from a single page.\u201d<\/p>\n

Helping Protect Privacy on Social Networks<\/h2>\n

Then there is the winner of the WWW 2007 Best Paper Award, <em>Wherefore Art Thou R3579? Anonymized Social Networks, Hidden Patterns, and Structural Steganography<\/em>, part of the WWW 2007 Data Mining track\u2019s Mining in Social Networks session.<\/p>\n

The paper\u2019s amusing title masks a serious concern. Some social-network sites on the Web have suggested anonymization of the communications within those networks. Dwork and her Cornell colleagues argue that such efforts would fail to protect the privacy of participants.<\/p>\n

\u201cWe described two attacks, one active, one passive,\u201d Dwork says. \u201cThe heart of both attacks is to create a small structure in the communication graph that can be recognized. This structure corresponds to a small subgraph, where each vertex is a user account and an edge between vertices indicates communication between the two user accounts.<\/p>\n

\u201cOnce an attacker has located the structure, she or he can find the connection pattern between any two accounts that are both connected to the structure. For example, a small group of friends can together find out whether Alice and Bob, each of whom is linked to the small group, are in communication with one another.\u201d<\/p>\n

Such discoveries could wreak havoc on the implied trust social networks seem to offer.<\/p>\n

\u201cOur project,\u201d Dwork says, \u201cis on privacy-preserving analysis of data. The goal is to enable site hosts to reveal interesting information about the social-networking graph hosted on their computers without compromising privacy.\u201d<\/p>\n

Enabling Non-Readers to Use Computers<\/h2>\n

Microsoft Research India has been pursuing intriguing work on enabling illiterate or semi-literate persons to make effective use of PCs. A paper by Indrani Medhi, Archana Prasad, and Toyama of Microsoft Research India\u2014called <em>Optimal Audio-Visual Representations for Illiterate Users of Computers<\/em>, part of the Communication in Developing Regions session of the conference\u2019s Technology for Developing Regions track\u2014marks the latest step in the lab\u2019s research.<\/p>\n

\u201cWe wanted to find out what was the most comprehensible way to represent concepts to a non-literate person,\u201d Medhi explains. \u201cThe project was a careful study comparing a variety of different representational types.<\/p>\n

\u201cWe tested how health symptoms could best be represented, with a subject group of 200 illiterate people. For each, we randomly selected one representation from among 10: text, static drawings, static photographs, hand-drawn animations, and video, each with and without voice annotation.\u201d<\/p>\n

The results of the study were interesting:<\/p>\n