The main entry point is the Session class's Learn() method, which returns a Program object. The Program's key method is Run(), which executes the program on an input JSON to obtain the extracted output. Each program also has a Schema property that defines the structure of the extracted data.
Other important methods are Serialize() and Deserialize(), which serialize and deserialize a Program object.
To use Extraction.Json, reference Microsoft.ProgramSynthesis.Extraction.Json.dll, Microsoft.ProgramSynthesis.Extraction.Json.Learner.dll, and Microsoft.ProgramSynthesis.Extraction.Json.Semantics.dll.
The Sample Project illustrates our API usage.
Basic Usage
By default, Extraction.Json learns a join program, in which inner arrays are joined with the other fields. As a result, a single outer object in the input JSON can be flattened into several rows in the output table.
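The flattening described above can be sketched without the library. This is a minimal illustration of the idea only, not Extraction.Json's implementation: the Order and Row types and the JoinSketch.Flatten helper are hypothetical names introduced here, and the sketch assumes that each inner-array element is paired with the fields of its enclosing object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; they are not part of Extraction.Json.
public sealed record Order(string Id, IReadOnlyList<string> Items);
public sealed record Row(string Id, string Item);

public static class JoinSketch
{
    // Pairs every inner-array element with the fields of its outer object,
    // producing one flat row per element.
    public static List<Row> Flatten(IEnumerable<Order> orders)
    {
        var rows = new List<Row>();
        foreach (var order in orders)
        {
            foreach (var item in order.Items)
            {
                rows.Add(new Row(order.Id, item));
            }
        }
        return rows;
    }
}
```

For example, calling JoinSketch.Flatten on an order A1 with items pen and ink and an order B2 with the single item cap produces three rows, two for A1 and one for B2.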
The snippet below shows a learning session that generates a join program from the input jsonText:
string jsonText = ...
var session = new Session();
session.Inputs.Add(jsonText);
Program program = session.Learn();
Clients may add a NoJoinInnerArrays constraint to the session to learn non-join programs, as illustrated in the following snippet:
var noJoinSession = new Session();
// Add the input to the new session, not to a previously created one.
noJoinSession.Inputs.Add(jsonText);
noJoinSession.Constraints.Add(new NoJoinInnerArrays());
Program noJoinProgram = noJoinSession.Learn();
Serializing/Deserializing a Program
The Extraction.Json.Program.Serialize() method serializes the learned program to a string. The Extraction.Json.Loader.Instance.Load() method deserializes the program text back into a program.
// program was learned previously
string progText = program.Serialize();
Program loadProg = Loader.Instance.Load(progText);
Executing a Program
Given an input JSON, a program can generate either a hierarchical tree or a flattened table. If the program is a join program, the table is flattened using either outer join (the default) or inner join semantics.
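The difference between the two join semantics can be sketched without the library. The sketch below assumes the usual relational meaning: an outer join keeps an outer object whose inner array is empty and fills the missing value with null, while an inner join drops it. The source does not spell out how Extraction.Json treats empty or missing inner arrays, so this is an assumption, and JoinKind, Customer, PhoneRow, and JoinSemanticsSketch are hypothetical names, not part of the library.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; they are not part of Extraction.Json.
public enum JoinKind { Outer, Inner }
public sealed record Customer(string Name, IReadOnlyList<string> Phones);
public sealed record PhoneRow(string Name, string? Phone);

public static class JoinSemanticsSketch
{
    // Flattens customers and their phone numbers into rows under the chosen semantics.
    public static List<PhoneRow> ToTable(IEnumerable<Customer> customers, JoinKind kind)
    {
        var rows = new List<PhoneRow>();
        foreach (var customer in customers)
        {
            if (customer.Phones.Count == 0)
            {
                // Assumed behavior: an outer join keeps the customer with a null phone,
                // while an inner join emits no row for this customer.
                if (kind == JoinKind.Outer)
                {
                    rows.Add(new PhoneRow(customer.Name, null));
                }
                continue;
            }
            foreach (var phone in customer.Phones)
            {
                rows.Add(new PhoneRow(customer.Name, phone));
            }
        }
        return rows;
    }
}
```

Under these assumptions, a customer with two phone numbers contributes two rows under both semantics, while a customer with no phone numbers contributes one row with a null phone under JoinKind.Outer and no row under JoinKind.Inner.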
Generating a Tree
Use the Run() method to obtain a hierarchical tree of the input document.
// program was learned previously
ITreeOutput tree = program.Run(jsonText);
Generating a Table
Supply the desired join semantics to the RunTable() method as follows:
// program was learned previously
IEnumerable outerJoinTable = program.RunTable(jsonText, TreeToTableSemantics.OuterJoin);
IEnumerable innerJoinTable = program.RunTable(jsonText, TreeToTableSemantics.InnerJoin);