{"id":303464,"date":"2012-09-26T09:00:37","date_gmt":"2012-09-26T16:00:37","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=303464"},"modified":"2016-10-11T09:02:40","modified_gmt":"2016-10-11T16:02:40","slug":"12-campaign-predicting-u-s-election","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/12-campaign-predicting-u-s-election\/","title":{"rendered":"\u201912 Campaign: Predicting the U.S. Election"},"content":{"rendered":"

By Rob Knies, Managing Editor, Microsoft Research<\/em><\/p>\n

It\u2019s a presidential election year in the United States, and that, we\u2019ve learned, means that pollsters are on the prowl. The electorate for the forthcoming balloting will be sampled, questioned, categorized, sliced, and diced a zillion different ways between now and Nov. 6, so if you\u2019re interested in gender polling by age bracket in Wirt County, W.Va., for the time being, you\u2019re in luck.<\/p>\n

David Rothschild<\/p><\/div>\n

So is David Rothschild<\/a>, an economist at Microsoft Research New York City<\/a>. Trained during an academic career that culminated with a Ph.D. in applied economics from the Wharton School of Business at the University of Pennsylvania, Rothschild is also an avid follower of the political scene.<\/p>\n

He has gained lots of renown this year for his work using prediction markets to harness big data in its many and varied forms to calculate and disseminate his prediction for who will be elected president. His research reflects Microsoft\u2019s deep expertise in machine learning to recognize complex patterns, make intelligent, data-based decisions, and open new avenues of exploration previously unattainable. This provides the foundation for techniques that promise to unlock the power of social-media data and to transform political-forecasting models.<\/p>\n

Pundits and talking heads flock to his posts on a pair of blogs, PredictWise<\/a> and The Signal. He is breathing the rarefied air encountered only when an individual moves from being an interested observer of the political process to becoming an influential participant in that arena.<\/p>\n

\u201cDavid\u2019s work offers a unique and innovative method for predicting election outcomes,\u201d says Sunshine Hillygus, associate professor of political science at Duke University. \u201cBy aggregating and correcting data from state-level polls and election markets, he produces a forecast that is far more useful than the simple national polling estimates that dominate media coverage.\u201d<\/p>\n

David Pennock<\/a>, assistant managing director of Microsoft Research New York City, couldn\u2019t agree more.<\/p>\n

\u201cDavid is incredible,\u201d Pennock says. \u201cI\u2019ve described him as a force of nature. The amount he can get done, his ideas and insights \u2026 you get excited just watching him. It\u2019s groundbreaking research.\u201d<\/p>\n

Rothschild joined Microsoft in May as a founding member of the New York City lab, and he has spent his time since then building prediction and sentiment models and organizing novel, experimental polling and prediction games. Indeed, his research centers on prediction markets, and on PredictWise, you can find his analyses of projected box-office receipts for upcoming movies, the state of the U.S. economic recovery, or the winner of this year\u2019s baseball World Series.<\/p>\n

His passions, though, run deepest in the political sphere. It\u2019s all a matter of navigating and analyzing massive amounts of data to uncover meaningful patterns or relationships that previously were hidden.<\/p>\n

\u201cAll these projects stem from a general research idea of thinking about all the data we have,\u201d he says. \u201cIt can be external, such as Facebook or Twitter: individual-level information people provide to the world. It can be internal: search or page views, things that people provide to Microsoft. And it can be things like polling and prediction markets, where people actively get more information to solve particular questions.<\/p>\n

Meaningful, Aggregated<\/h2>\n

\u201cHow do you combine all that data and turn it from raw data into meaningful, aggregated outcomes?\u201d<\/p>\n

The answer, Rothschild asserts, is to take two initial steps. The first is to use all of this data to make it efficient to create predictions, sentiment indexes, and interest indexes that match the needs of stakeholders. The second is to enable people to absorb the information, using data visualizations or other techniques to make it impactful.<\/p>\n

He already is putting this to work for Microsoft. Since August, he has been working with Xbox LIVE<\/a> to provide polling guidance for that service\u2019s Election 2012 hub, which enables members to interact in real time during the three presidential debates and the vice presidential debate, and to take part in daily polling conducted with YouGov<\/a>. The polling is providing a snapshot of how Xbox LIVE\u2019s passionate, technically savvy user base is reacting to campaign developments.<\/p>\n

\u201cIf we\u2019re going to be polling people,\u201d Rothschild says, \u201ccan we learn anything from how people understand things and how efficient the data is to create polls and prediction markets that are even more effective at gathering the right information?\u201d<\/p>\n

That\u2019s the sort of reflection that is changing the game of understanding voters\u2019 intent\u2014an effort sorely in need of a fresh perspective. As Rothschild explains it, the science of political polling until recently was stuck in a rut three quarters of a century old.<\/p>\n

\u201cIn the mid-\u201920s up until the early \u201930s,\u201d he says, \u201cpeople got this idea to poll as many people as possible on who they would be voting for in the upcoming election. This, they thought, would provide some indication of what was going to happen.\u201d<\/p>\n

That technique worked for The Literary Digest<\/em>\u2014for a while. <\/em>In four consecutive U.S. presidential elections, from 1920 to 1932, its straw poll correctly predicted the winner. In 1936, though, things changed dramatically. In one of the classic pratfalls in U.S. political history, the magazine published its poll indicating that Alf Landon, Republican governor of Kansas, would be a big winner. On Election Day, incumbent Franklin Delano Roosevelt carried 46 states, Landon two. Shortly thereafter, its credibility in tatters, The Literary Digest<\/em> closed its doors for good.<\/p>\n

During the same election cycle, an upstart named George Gallup was able to predict Roosevelt as the winner by an astute use of representative samples of each state. His creation, the Gallup Poll, remains influential to this day.<\/p>\n

The Gold Standard<\/h2>\n

\u201cThat became the gold standard,\u201d Rothschild says. \u201cFor the next 75 years, the idea of the most efficient thing to do was to take a sample that represented likely voters or registered voters and report the raw data.\u201d<\/p>\n

Daily polls, though, are notoriously noisy and random, although aggregating the numbers from recent polls increases accuracy significantly. In addition, the wording used in the polls was faulty: \u201cWho would you vote for if the election were held today?\u201d The problem is that the election almost never is held \u201ctoday.\u201d Known factors such as the anti-incumbent bias, in which incumbents poll more poorly around Labor Day than they do on Election Day, skew the numbers.<\/p>\n
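
Why aggregation helps can be shown with a rough sketch. It assumes each poll is a simple random sample drawn from the same electorate, an assumption made here for illustration rather than a description of the method Rothschild uses. Under that assumption, pooling independent polls behaves like one poll with the combined sample size, so the standard error shrinks with the square root of the total. The function names and sample sizes below are hypothetical.<\/p>\n

```python
import math


def poll_standard_error(share, sample_size):
    # Standard error of a proportion from a simple random sample (assumed model).
    return math.sqrt(share * (1.0 - share) / sample_size)


def pooled_standard_error(share, sample_sizes):
    # Independent polls of the same population pool into one larger sample.
    return poll_standard_error(share, sum(sample_sizes))


# Hypothetical numbers: one poll of 1,000 people versus five such polls pooled.
single = poll_standard_error(0.5, 1000)
pooled = pooled_standard_error(0.5, [1000] * 5)
print(round(single, 4), round(pooled, 4))
```

Under these assumptions, pooling five polls cuts the standard error from roughly 1.6 percentage points to roughly 0.7. Real polls differ in timing, wording, and sample design, so the gain in practice is likely to be smaller.<\/p>\n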

\u201cEven if you correct for that,\u201d Rothschild notes, \u201cyou\u2019re creating an expected vote share, but most Americans don\u2019t really care about vote share. George W. Bush in 2000 had no less political capital to spend after his razor-thin election than Ronald Reagan after his \u201984 landslide. What people actually care about is who\u2019s going to win. That\u2019s the only thing that matters, and that\u2019s what we focus on when we gather data and create predictions.<\/p>\n

\u201cIt\u2019s amazing to me to think about how novel it is to think, \u2018Let\u2019s forecast the thing people actually care about.\u2019 That\u2019s one of the major things with which I\u2019m trying to approach all these things, thinking about what are the most efficient things I can create, and what does the end user really want and need? Can we create that? How close can we get to that?\u201d<\/p>\n
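
The gap between an expected vote share and a probability of winning can be illustrated with a minimal sketch. It assumes, purely for illustration, a two-candidate race in which the error around the forecast share follows a normal distribution. The function name, the 52 percent share, and the 2-point error are hypothetical and are not taken from the models Rothschild builds.<\/p>\n

```python
import math


def win_probability(expected_share, error_std):
    # Probability that the final two-party share exceeds 50 percent,
    # assuming a normally distributed forecast error (illustrative assumption).
    z = (expected_share - 0.5) / error_std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical forecast: 52 percent expected share, 2-point standard error.
print(round(win_probability(0.52, 0.02), 3))
```

In this sketch, a two-point lead in expected share translates into a probability of winning of roughly 84 percent, a single number that speaks directly to the question of who is going to win.<\/p>\n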

Accuracy is paramount, of course, but so is timing. Presidential forecasts, typically, are evaluated the night before the election. But, as Rothschild says, such forecasts are \u201cpretty darn worthless.\u201d What really has value is a forecast two months before an election.<\/p>\n

\u201cIt\u2019s the same thing with marketing-type questions,\u201d he explains. \u201cIt doesn\u2019t do much good if I can tell you which jeans are going to be popular the day before you put the jeans onto the market. You\u2019ve already produced those jeans. If I can forecast what type of jeans are going to be popular two months beforehand, then you can make the right investment strategy.<\/p>\n

Information When We Want It<\/h2>\n

\u201cI hope to get people to grade and judge people\u2019s forecasts and people\u2019s data streams when the people need it. We expect to have information all the time, when we want it. Five, 10 years from now, it\u2019s going to seem anachronistic that we thought about economic indicators on a monthly basis. I\u2019d be very surprised if very strong tracking on a minute-by-minute basis has not been developed.\u201d<\/p>\n

Pennock has seen that sort of nearly instantaneous data analysis play out in reality.<\/p>\n

\u201cThe great thing is that David is providing predictions in real time,\u201d he says. \u201cYou can see these reactions within minutes. After Rick Perry made his \u2018oops\u2019 mistake in that GOP debate, you could literally watch the predictions crash<\/a> in almost real time.\u201d<\/p>\n

Beyond the interest generated by working in such a high-profile area as presidential-election predictions, what are the research benefits of such work?<\/p>\n

\u201cNumber one is forecasting,\u201d Rothschild says. \u201cYou have a goal of creating the most accurate forecast at any given moment, because that will help create a more efficient world. Economists generally want to make a more efficient universe, and accurate forecasts on a regular basis help to do that.<\/p>\n

\u201cThe second goal is to understand the world. It\u2019s a research goal that is both beneficial to greater research as well as beneficial to decision-makers. It\u2019s understanding why things happen. Granular, correct, and efficient forecasts can help you understand the effect of a debate, the effect of a $10 million ad buy. You can see movement as things happen.\u201d<\/p>\n

To provide accurate forecasts and to gain a greater understanding of the world around us, Rothschild relies on data.<\/p>\n

\u201cYou want to be able to aggregate as much information as possible and create a prediction about what\u2019s going to happen,\u201d he says. \u201cWith prediction markets, you can get a self-selected group of people who have a lot more information than those in traditional polling. These are people who know a lot about elections. I got into this by thinking about polls versus prediction markets: What are we learning from these different things?\u201d<\/p>\n

Flocking to Xbox LIVE<\/h2>\n

That musing led him to create hybrid approaches, and that\u2019s what\u2019s happening on Xbox LIVE. Users of the service are not a perfect representation of the U.S. populace, but by asking unique questions and finding new ways to combine the answers, Rothschild is adding new ingredients to the prognosis stew. It\u2019s certainly popular: As many as 10,000 people per day are participating in Xbox LIVE\u2019s daily polls.<\/p>\n

Back to that standard polling question: \u201cIf the election were held today \u2026\u201d It\u2019s static, it\u2019s easy, it\u2019s computationally trivial. And then there\u2019s the Rothschild approach.<\/p>\n

\u201cAsking people the probability something is going to happen\u2014\u2018What do you think is going to happen?\u2019\u2014is a lot trickier, because we don\u2019t have a track record of asking these questions, and they don\u2019t implicitly translate into anything very clean.\u201d<\/p>\n

Even so, such probabilistic probing has one key advantage: It works.<\/p>\n

\u201cBy asking somebody, \u2018Who do you think is going to win the election?\u2019 it touches on their intention, the intentions of their friends and family and those people they discuss elections with.<\/p>\n

\u201cWe found a sampling of 345 times where potential voters were asked who they were going to vote for, who they thought was going to win, and when those questions got different results. When the results were<\/em> different, more than half of the voters said they wanted candidate A to win, but that they expected candidate B to win. Seventy-five percent of the time, candidate B won.\u201d<\/p>\n

That\u2019s not all.<\/p>\n

\u201cAsking a person\u2019s expectations has a multiplicative effect,\u201d Rothschild states. \u201cIt\u2019s the equivalent of asking 10 random voters who they were going to vote for and reporting back a binary result of a poll of 10 people.\u201d<\/p>\n

That\u2019s not all.<\/p>\n

\u201cWe\u2019re able to show that even with an incredibly biased group of people,\u201d he adds, \u201cif you ask them their expectations, you can turn that into a meaningful forecast that something\u2019s going to happen.<\/p>\n

Lopsided Expectations, Accurate Forecasts<\/h2>\n

\u201cIf you just take those people who claim that they\u2019re going to vote for the Democratic candidate, or just those people who claim they\u2019re going to vote for the Republican candidate, by seeing how lopsided their expectations are for their candidate to win, you can make a very strong expectation of whether a candidate is going to win.\u201d<\/p>\n

The polling work Rothschild has done with Xbox LIVE has helped refine such techniques.<\/p>\n

\u201cWe have younger people and more males,\u201d he explains. \u201cOne of the ways that we\u2019re attacking that is by asking questions about people\u2019s social network. There are ways in which we can take a biased sample and have them give us information about a less-biased sample of people they may know.\u201d<\/p>\n

Such experimentation, of course, must be conducted with stringent privacy restrictions to protect individual users. For Rothschild, though, the goal is not so much to predict a particular election, even one as momentous as that for a U.S. president, as to gain knowledge about making future models more robust.<\/p>\n

\u201cNothing I do ever is calibrated with the 2012 election in mind,\u201d he says. \u201cNo model I\u2019ve ever created, no set of data I\u2019ve ever considered, do I consider it for how this works for the 2012 election. I do it to determine how this works in a total, historical view and how it works in a universal view.\u201d<\/p>\n

He\u2019s hoping he can extend his techniques to continuous forecasts for all 435 seats in the U.S. House of Representatives in 2014. He also wants to carry his exploration into the realm of economic indicators, with the goal of providing accurate, meaningful predictions that shed more light on the underpinnings of the economy.<\/p>\n

Giving such potentially revolutionary research its best chance at success requires understanding and commitment. Microsoft, Rothschild says, is delivering those in spades.<\/p>\n

\u201cMicrosoft has made a very strong commitment to me and a few of us in New York,\u201d he says. \u201cMicrosoft understands that it\u2019s important for it to be seen as a leader in these fields. That allows us to produce better research.<\/p>\n

\u201cMicrosoft has afforded us the ability to sit back and think about this in the long run: What are the implications of this information? How do we utilize it? How do we make it more efficient? What are the next steps? It\u2019s very exciting.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"

By Rob Knies, Managing Editor, Microsoft Research It\u2019s a presidential election year in the United States, and that, we\u2019ve learned, means that pollsters are on the prowl. The electorate for the forthcoming balloting will be sampled, questioned, categorized, sliced, and diced a zillion different ways between now and Nov. 6, so if you\u2019re interested in […]<\/p>\n","protected":false},"author":39507,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[194474,194475,194479,194455],"tags":[201249,213536,213533,213524,213527,203341,203353,213521,213530,213518,187150],"research-area":[13556,13563,13548],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-303464","post","type-post","status-publish","format-standard","hentry","category-data-visulalization","category-database-data-analytics-platforms","category-economics","category-machine-learning","tag-david-rothschild","tag-forecasting","tag-gallup-poll","tag-political-forecasting-models","tag-predicting-election-outcomes","tag-prediction-markets","tag-predictwise","tag-social-media-data","tag-u-s-economic-recovery","tag-united-states-presidential-election","tag-xbox-live","msr-research-area-artificial-intelligence","msr-research-area-data-platform-analytics","msr-research-area-economics","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199571],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","by
line":"","formattedDate":"September 26, 2012","formattedExcerpt":"By Rob Knies, Managing Editor, Microsoft Research It\u2019s a presidential election year in the United States, and that, we\u2019ve learned, means that pollsters are on the prowl. The electorate for the forthcoming balloting will be sampled, questioned, categorized, sliced, and diced a zillion different ways…","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/303464"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/39507"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=303464"}],"version-history":[{"count":6,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/303464\/revisions"}],"predecessor-version":[{"id":303509,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/303464\/revisions\/303509"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=303464"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=303464"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=303464"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=303464"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=303464"},{"taxonomy":"msr-event-type","embeddable":true,"href":"ht
tps:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=303464"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=303464"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=303464"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=303464"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=303464"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=303464"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}