{"id":302381,"date":"2013-02-12T16:09:41","date_gmt":"2013-02-13T00:09:41","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=302381"},"modified":"2016-10-12T10:56:41","modified_gmt":"2016-10-12T17:56:41","slug":"flash-fill-gives-excel-smart-charge","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/flash-fill-gives-excel-smart-charge\/","title":{"rendered":"Flash Fill Gives Excel a Smart Charge"},"content":{"rendered":"
By Douglas Gantenbein, Senior Writer, Microsoft News Center<\/em><\/p>\n In December 2009, Sumit Gulwani, a senior researcher at Microsoft Research Redmond<\/a>, was flying home from a seminar after presenting his work devising ways to synthesize complex pieces of code<\/a>.<\/p>\n Sitting next to him was a businesswoman. She was delighted to learn that Gulwani was both a Ph.D. in computer science and a Microsoft employee. She popped open her laptop and asked, \u201cIs there a way to merge two columns in Excel<\/a>, when one column has a first name, the other the last, so that a column has both first and last names?\u201d<\/p>\n Gulwani paused.<\/p>\n \u201cI had to admit to her that I knew next to nothing about Excel,\u201d Gulwani recalls. \u201cBut when I got home, I opened my own laptop and looked for Excel search words. I found hundreds of help topics. And that made me think: Maybe this is the next problem I should solve.\u201d<\/p>\n Solve it he did. In a collaborative effort that spanned both Microsoft Research and the Office 2013<\/a> team, Microsoft has introduced a feature in Excel 2013 called Flash Fill<\/a>. Flash Fill earned enthusiastic reviews even before Office 2013 was released Jan. 29, and it is sure to be popular with users. Flash Fill takes tedious tasks now often performed manually in Excel\u2014or by writing one-off programs\u2014and executes them automatically. And it does so simply, requiring only the typing in an example of what the user wants.<\/p>\n Gulwani credits an interdisciplinary approach to helping make Flash Fill possible. It relies on his expertise in program verification and program synthesis, an approach to generating small pieces of code based on a user\u2019s needs. Machine learning, a strong focus of Microsoft Research, provides the ability to rank the most suitable programs Flash Fill uses in a task. And contributions from human-computer-interaction studies underpin Flash Fill\u2019s effort to avoid the discoverability issue associated with new features by automatically invoking the synthesis engine after the user provides indication of a repetitive task.<\/p>\n How does Flash Fill work? Let\u2019s say an Excel user has a column of Social Security numbers. But they\u2019re not formatted correctly\u2013they\u2019re 123456789, not 123-45-6789. An Excel user simply creates a new column adjacent to the existing column, then types in the correct example: 123-45-6789. Flash Fill immediately fills in all rows below the example with properly formatted figures. The user needs just to click to accept them all.<\/p>\n Flash Fill works for dozens of tasks: removing or adding formatting to names or other word groups, correcting inconsistencies in punctuation or the use of symbols such as brackets, converting dates of birth from something such as 8-4-1954 to 8\/4\/54, and many others.<\/p>\n \u201cFlash Fill brings the power of text manipulation to the hands of common, non-technical, everyday users,\u201d says Chad Rothschiller, a program manager with the Excel 2013 team who worked closely with Gulwani. \u201cEven people who aren\u2019t familiar with Excel can use the feature. It takes complex functionality that someone would otherwise have to build in formulas or code and puts it in the hands of common users.\u201d<\/p>\n Flash Fill performs these chores in a way that even a newcomer to Excel can appreciate. That\u2019s a big improvement over the previous state of things, in which a user had to know programming to create tools that can automate everyday Excel jobs.<\/p>\n But Flash Fill\u2019s big leap is in how it performs Excel tasks. Computer scientists have long known that a software program can be taught certain tasks through a process called \u201clearning by demonstration.\u201d But that requires the software user to show the software exactly how to do something.<\/p>\n Gulwani and his team developed Flash Fill to learn by example, not demonstration. A user simply shows Flash Fill what he or she wants to do by filling in an Excel cell with the desired result, and Flash Fill quickly invokes an underlying program that can perform the task.<\/p>\n It\u2019s the difference between teaching someone how to make a pizza step by step and simply showing them a picture of a pizza and minutes later eating a hot pie.<\/p>\n But that simplicity comes with a price.<\/p>\n \u201cThe biggest challenge,\u201d Gulwani says, \u201cis that learning by example is not always a precise description of the user\u2019s intent\u2014there is a lot of ambiguity involved.<\/p>\n \u201cTake the example of Rick Rashid<\/a> [Microsoft Research\u2019s chief research officer]. Let\u2019s say you want to convert Rick Rashid to Rashid, R. Where does that \u2018R\u2019 come from? Is it the \u2018R\u2019 of Rick or the \u2018R\u2019 of Rashid? It\u2019s very hard for a program to understand.\u201d<\/p>\n For each situation, Flash Fill synthesizes millions of small programs\u201410-20 lines of code\u2014that might accomplish the task. It sounds implausible, but Gulwani\u2019s deep research background in synthesizing code makes it possible. Then, using machine-learning techniques, Flash Fill sorts through these programs to find the one best-suited for the job.<\/p>\n Gulwani and his team designed Flash Fill to generalize user examples into a small program, then to run that program on the remaining spreadsheet rows to populate the output column.<\/p>\n The underlying technology is based on recent advances in automated program generation from examples. Since the number of programs that are consistent with a small number of examples is so often huge, Flash Fill uses efficient data structures and algorithms to represent and compute these programs. It then uses machine-learning-based ranking techniques to identify the most likely program from this set.<\/p>\n But despite the potential of Flash Fill, Gulwani found himself facing a hard sales pitch when he first suggested its inclusion in Office 2013. Early in 2010, he gave a group of Office 2013 project managers a talk that showed how Flash Fill, then called QuickCode, could work but did not include actual demos.<\/p>\n \u201cI am giving the presentation,\u201d he recalls, \u201cand they stop me in the middle of it and say: \u2018Sumit, you don\u2019t need to convince us this is practically useful. But we don\u2019t believe you can do this. It looks too much like magic.\u2019\u201d<\/p>\n To show that his idea wasn\u2019t \u201cmagic,\u201d So Gulwani spent the next three weeks programming code so he could demonstrate Flash Fill. That caught the attention of the Office 2013 team, and work began on incorporating the new feature.<\/p>\n Rothschiller recalls his first impressions of the feature:<\/p>\n \u201cWhenever you see something for the first time,\u201d he says, \u201cit takes a while to sink in\u2014especially if the technology is complex or powerful. The general idea was really interesting, but we had some questions. Could it be broadened to cover a few more of the basic scenarios? Can we put this in the hands of basic users? And how much developer time would it take to get the code working inside the Office codebase?<\/p>\n \u201cThe hard thing about complex technology is making it simple enough. So the initial demos went really deep, but everything took several clicks and lots of setup to get things to work. The most compelling part of the underlying technology is that everyone can look at data and provide some examples of the output they\u2019d like to see. It\u2019s way harder to encode that into a programming language, script, or formula.<\/p>\n Now, Rothschiller says, \u201cFlash Fill brings the power of text manipulation to the hands of common, non-technical, everyday users. Even people who aren\u2019t familiar with Excel can use the feature.\u201d<\/p>\n Rothschiller took a lead role in designing the new feature for ease of use. And both the Office team and Gulwani\u2019s team worked to port the new feature\u2019s C# programming language into Office\u2019s native C++. Gulwani, together with intern Rishabh Singh<\/a> and Dany Rouhana, a research software-design engineer, worked closely with Rothschiller, as well as Weide Zhong, a senior software engineer on the Office team, to integrate Flash Fill with Excel 2013.<\/p>\n The Office team also worked to design the research code to make it compatible with Office and more user-friendly. Gulwani, Singh, and Zhong, meanwhile, worked to give Flash Fill the intelligence to understand, in most cases, a user\u2019s intent with just one example, and almost never more than two or three. Their goals were to ensure that Flash Fill was as fast as possible, so users would not have to wait for it to perform a task.<\/p>\n Reviews of Flash Fill from technical journalists who reviewed the release-to-manufacturing version of Excel 2013 have heaped praise on Flash Fill.<\/p>\n \u201cSimply type your intended target text in the next field,\u201d wrote Jonathan Bray in PC Pro<\/em><\/a>, \u201chit the Flash Fill button on the Data Ribbon, and Excel will complete the job for you, splitting out the text all down the appropriate column. Genius.\u201d<\/p>\n Added Ars Technica<\/em><\/a>: \u201cOne of the shock-and-awe features of the Excel 2013 demo was the “flash fill” feature. This is a piece of background logic that watches what you’re putting into cells in a column and looks for patterns. If it recognizes the data as being extracted from text in an adjacent cell, it will automatically “ghost” the rest of the matching content into the column as you type.\u201d<\/p>\nInterdisciplinary Approach<\/h2>\n
Sales Pitch<\/h2>\n
As Fast as Possible<\/h2>\n