{"id":1146392,"date":"2025-08-04T02:31:57","date_gmt":"2025-08-04T09:31:57","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=1146392"},"modified":"2025-10-09T19:21:34","modified_gmt":"2025-10-10T02:21:34","slug":"timecraft-a-universal-framework-for-time-series-generation","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/timecraft-a-universal-framework-for-time-series-generation\/","title":{"rendered":"TimeCraft: A universal framework for time-series generation"},"content":{"rendered":"\n
Time-series data\u2014measurements collected at regular intervals, like stock prices or traffic flows\u2014has become a key driver of intelligent decision-making systems across industries. From medical monitoring to financial risk control, identifying patterns in this data is essential to many important operations.<\/p>\n\n\n\n
At the same time, the creation of time-series data, or data synthesis, <\/em>is gaining momentum as organizations grapple with the scarcity of real-world data, privacy protection, and the need to test a variety of scenarios without exposing themselves to risk. AI-generated synthetic data simulates realistic patterns in a risk-free environment. It enables researchers to explore hypothetical scenarios and train models to make decisions in high-stakes contexts.<\/p>\n\n\n\n Yet many data-generation models fall short of what\u2019s needed. To be truly practical, a generator of time-series data must adapt across different industries and data patterns, offer precise control over trends and volatility, and produce data that is realistic and reliable enough to support accurate modeling and analysis.<\/p>\n\n\n\n Microsoft Research Asia developed TimeCraft to address this need. This open-source framework creates synthetic time-series data that can be used across different industries and scaled up for commercial applications. Users control data generation through simple written commands, and the system can adapt to different business needs, whether companies want to analyze existing patterns or create data for specific goals.<\/p>\n\n\n\n TimeCraft\u2019s user interface is built for flexibility. Users can guide data generation through three distinct methods:<\/p>\n\n\n\n These methods can be used independently or together, allowing users to generate data that aligns with specific goals, scenarios, or operational needs.<\/p>\n\n\n\n TimeCraft works across multiple industries\u2014where each type of time-series data follows distinct patterns\u2014with a unified approach built around semantic prototypes. 
These shared representations of time-series structures serve as a universal vocabulary.<\/p>\n\n\n\n When users provide a few example time-series sequences from their specific industry, the Prototype Assignment Module (PAM) maps them to the prototype space, calculating optimal combinations. This industry-specific input guides the model to generate structurally aligned data, with no labels or retraining needed.<\/p>\n\n\n\n The result is a system that can rapidly adapt to new scenarios in fields such as energy, healthcare, finance, and transportation, demonstrating strong structural transfer and generalization.<\/p>\n\n\n\n In many real-world scenarios, users know what kind of data they need but don\u2019t have access to enough relevant examples. A typical request might be: \u201cI want a time series that slowly rises for a few days, drops around day 10, and then fluctuates.\u201d These types of needs often arise in fields like healthcare and finance, where designing and testing systems with realistic data is essential but data access is limited.<\/p>\n\n\n\n TimeCraft makes it possible to generate this kind of tailored data using plain language. Instead of relying on specialized tools or existing datasets, users can simply describe the pattern they\u2019re looking for, and the system creates data that fits.<\/p>\n\n\n\n It does this using a collaborative training process involving multiple AI agents. This process collects phrasing from real-world industry reports, fills in details using actual data statistics, and refines the wording until the descriptions match the data both clearly and accurately.<\/p>\n\n\n\n When a user submits a description, TimeCraft translates it into guidance for its generative model, enabling direct input, even from users without technical expertise. This makes the tool especially useful in situations where data is scarce or constantly changing. 
By bridging the user\u2019s intent with the model\u2019s capabilities, TimeCraft makes custom data generation as simple as writing a sentence. This process is illustrated in Figure 2.<\/p>\n\n\n\n Most generation models focus on producing realistic data. TimeCraft goes a step further, generating data that improves performance of downstream applications\u2014whether it\u2019s detecting disease trends or modeling market behavior.<\/p>\n\n\n\n This is possible thanks to TimeCraft\u2019s task-aware generation framework. Users can integrate their existing models directly into the data-creation process. The system then uses feedback from these models to guide the direction of data generation in real time, so the output isn\u2019t just realistic, it\u2019s useful.<\/p>\n\n\n\n At the core of this method is a technique called influence scoring<\/em>, which estimates how each piece of generated data affects a model\u2019s performance. TimeCraft uses these scores to guide the generation process, helping the system produce data with the greatest potential to improve results. This process is shown in Figure 3.<\/p>\n\n\n\n This approach is especially helpful in cases where certain patterns are rare or critically important. For instance, in medical diagnosis, TimeCraft can focus on generating a small set of patterns that meaningfully improve prediction accuracy.<\/p>\n\n\n\n By shifting the goal from simulating data to generating data that actively improves outcomes, TimeCraft turns synthetic data into a strategic tool.<\/p>\n\n\n\n TimeCraft was built for real-world applications. It accepts different types of input, adapts to complex use cases, and improves over time using feedback from the tasks it supports. 
Researchers at Microsoft Research Asia envision it as a comprehensive solution for industries where data is limited, expensive to collect, or sensitive to share\u2014making data generation more targeted, useful, and scalable.<\/p>\n\n\n\n Now open source, TimeCraft is available for developers, researchers, and business partners around the world to explore, test, and build on.<\/p>\n\n\n\n Related research:<\/strong><\/p>\n\n\n\n Cross-domain generalization<\/strong><\/p>\n\n\n\n Controllability<\/strong><\/p>\n\n\n\n Task adaptability<\/strong><\/p>\n\n\n\n General techniques<\/strong><\/p>\n\n\n\n Financial applications<\/strong><\/p>\n\n\n\n Time-series data\u2014measurements collected at regular intervals, like stock prices or traffic flows\u2014has become a key driver of intelligent decision-making systems across industries. From medical monitoring to financial risk control, identifying patterns in this data is essential to many important operations. At the same time, the creation of time-series data, or data synthesis, is gaining momentum 
[…]<\/p>\n","protected":false},"author":34512,"featured_media":1140307,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-content-parent":199560,"msr_hide_image_in_river":null,"footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-post-option":[269148,269142],"class_list":["post-1146392","msr-blog-post","type-msr-blog-post","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-include-in-river"],"msr_assoc_parent":{"id":199560,"type":"lab"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/1146392","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-blog-post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/34512"}],"version-history":[{"count":4,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/1146392\/revisions"}],"predecessor-version":[{"id":1151663,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/1146392\/revisions\/1151663"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1140307"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1146392"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1146392"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1146392"},{"
taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1146392"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}Three ways to guide generation<\/h2>\n\n\n\n

One model, many industries<\/h2>\n\n\n\n
Text-controlled generation: One sentence guides the model<\/h2>\n\n\n\n

Task-aware generation: Optimized for real-world impact<\/h2>\n\n\n\n

Built for real-world use, now open source<\/h2>\n\n\n\n