{"id":171459,"date":"2015-04-28T11:12:37","date_gmt":"2015-04-28T11:12:37","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/project\/platform-for-interactive-concept-learning-picl\/"},"modified":"2018-12-03T15:04:35","modified_gmt":"2018-12-03T23:04:35","slug":"platform-for-interactive-concept-learning-picl","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/platform-for-interactive-concept-learning-picl\/","title":{"rendered":"Platform for Interactive Concept Learning (PICL)"},"content":{"rendered":"
Rapid interaction between a human teacher and a learning machine presents both benefits and challenges when working with web-scale data. The teacher guides the machine toward accomplishing the task of interest, while the system leverages big data to find the examples that maximize the training value of each interaction with the teacher.
Building classifiers and entity extractors is currently an inefficient process involving machine learning experts, developers and labelers. PICL enables teachers with no machine learning expertise to build classifiers and entity extractors. PICL's user interface reflects this objective and exposes a few key actions that require neither ML nor engineering skills: teachers using PICL can (1) search or sample items to label, (2) label these items, (3) select and edit features, (4) monitor accuracy and (5) review errors.
Training, scoring and regularizing are not teacher actions in PICL. Rather, these computations happen implicitly and transparently: training and scoring start automatically after the teacher modifies features or provides labels. Teachers are always aware of the state of the system, as a status bar indicates which actions are not yet reflected in the current model.
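The sketch below illustrates this implicit behaviour, with scikit-learn standing in for PICL's internal learner; the TeachingSession class and its method names are assumptions for illustration, not PICL's actual API. Submitting labels or editing features simply marks the model as stale and triggers retraining, and the status reports whether the current model reflects all of the teacher's input.

```python
# Hypothetical sketch, not PICL's API: label and feature edits retrain implicitly.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


class TeachingSession:
    def __init__(self):
        self.texts, self.labels = [], []
        self.vectorizer, self.model = None, None
        self.stale = False  # surfaced to the teacher as a status bar

    def add_labels(self, items, labels):
        # Teacher action: submit labels; retraining starts implicitly.
        self.texts += items
        self.labels += labels
        self._retrain()

    def set_features(self, vectorizer):
        # Teacher action: select or edit features; retraining starts implicitly.
        self.vectorizer = vectorizer
        self._retrain()

    def _retrain(self):
        # In a real system this would run asynchronously in the background.
        self.stale = True
        if self.vectorizer is not None and len(set(self.labels)) >= 2:
            X = self.vectorizer.fit_transform(self.texts)
            self.model = LogisticRegression().fit(X, self.labels)
            self.stale = False

    def status(self):
        # Status bar: is every label and feature edit reflected in the model?
        return "retraining pending" if self.stale else "model up to date"


session = TeachingSession()
session.set_features(TfidfVectorizer())
session.add_labels(["hotel deals in rome", "python list comprehension"], [1, 0])
print(session.status())  # -> "model up to date"
```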
When teachers start building a model in PICL, they select at least one initial feature and search for seed positive and negative items via a text query. They can then label the data returned by the search and submit these labels. From this point, PICL automatically trains a model and starts making predictions on new data, i.e. producing scores. Teachers can then sample data deemed useful for improving the model (active learning), or keep searching the dataset. Once a model is available (i.e., after the cold-start period), PICL pre-labels the examples shown to the teacher with the current model's most likely prediction. As a result, the teacher can label efficiently by simply correcting the pre-labels that are wrong. Moreover, the process of explicitly correcting the model helps the teacher understand its weaknesses.
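The following toy loop sketches this workflow under the assumption of a scikit-learn text classifier; the corpus, the seed_search helper and the uncertainty-based sampling rule are illustrative choices, not a description of PICL's internals. It shows the cold start from a text search, retraining after each round of labels, active selection of the item the model is least certain about, and pre-labeling that item with the model's most likely prediction for the teacher to confirm or correct.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny stand-in for a web-scale corpus; the concept to learn is "travel query".
corpus = [
    "cheap flights to paris", "hotel deals in rome", "python list comprehension",
    "best airline tickets", "how to sort a dict in python", "rome travel guide",
    "numpy array indexing", "budget hotels near the beach",
]

def seed_search(query):
    # Cold start: a plain text search to find the first items worth labeling.
    return [i for i, text in enumerate(corpus) if query in text]

# The teacher labels a few seed positives and negatives found by searching.
labeled = {i: 1 for i in seed_search("hotel")}
labeled.update({i: 0 for i in seed_search("python")})

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)

for _ in range(3):  # a few rounds of the label/train/sample loop
    model = LogisticRegression().fit(X[list(labeled)], list(labeled.values()))
    unlabeled = [i for i in range(len(corpus)) if i not in labeled]
    if not unlabeled:
        break
    probs = model.predict_proba(X[unlabeled])[:, 1]
    # Active learning: surface the item the model is least certain about,
    # pre-labeled with the model's most likely prediction.
    idx = int(np.argmin(np.abs(probs - 0.5)))
    pick, pre_label = unlabeled[idx], int(probs[idx] >= 0.5)
    print(f"review: {corpus[pick]!r} pre-label={pre_label}")
    labeled[pick] = pre_label  # the teacher corrects this when it is wrong
```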
Teachers can also supply features to PICL, either by browsing a corpus of existing features or by creating a new feature from scratch (e.g. a dictionary). The active features, which represent the information the model currently 'sees', are always visible in the interface.
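As one concrete (and assumed) reading of such a dictionary feature, the snippet below turns a teacher-authored word list into a numeric feature a learner can consume; the word list and function name are made up for illustration.

```python
# A teacher-authored dictionary feature: its value here is simply the number of
# tokens in an item that appear in the word list.
travel_words = {"flight", "flights", "hotel", "hotels", "airline", "travel"}

def dictionary_feature(text, dictionary=travel_words):
    return sum(token in dictionary for token in text.lower().split())

print(dictionary_feature("Cheap flights to Paris and hotel deals"))  # -> 2
```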
At any point in time, teachers can evaluate their models: PICL splits the labeled data into a training and a test set so that it can compute and display performance metrics, including estimates of generalization performance. Teachers are thus empowered to label, add features, review, debug and search, and they can understand the performance of the model they produce. When they are confident in their model, it can be exported for deployment.
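A minimal sketch of this evaluation step, again with scikit-learn as a stand-in for PICL's learner and a toy labeled set, might look as follows: part of the labeled data is held out, and the resulting precision and recall figures serve as estimates of how the model would generalize.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Toy labeled set collected by the teacher (1 = travel query, 0 = not).
texts = ["hotel deals in rome", "cheap flights to paris", "rome travel guide",
         "python list comprehension", "numpy array indexing", "sort a dict in python"]
labels = [1, 1, 1, 0, 0, 0]

# Hold out part of the labels to estimate generalization performance.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=1/3, stratify=labels, random_state=0)

vectorizer = TfidfVectorizer().fit(X_train)
model = LogisticRegression().fit(vectorizer.transform(X_train), y_train)

print(classification_report(y_test, model.predict(vectorizer.transform(X_test))))
```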