
DY parser

Status: draft

This is an attempt to document and help you visualize the way the parser is implemented, so you can contribute to it by writing unit tests, improving algorithms or changing the syntax or behaviour.

The whole problem is: how can we convert this string written in the DY syntax (on the left) into the JSON representation (on the right) that respects the database structure?

In addition to exos, we would like to parse course and skills definitions too. Here is an example.

There is no direct path from text to JSON, so the following strategy is applied.

Global strategy

Here is the global strategy, detailed in the following sections. Based on prefixes, we can easily cut the lines of a DY file vertically and group them into blocks. We can then extract the value of each block, depending on its type (single line, multiple lines, a list, ...). Once we have this tree of blocks, we can browse it again to search for errors and convert each block into one or more fields. We can apply a custom conversion for fields that are not trivially structured in the final entities. Some errors can be checked against the final entity.

Now, let's see each step in detail, with concrete inputs and outputs...

DY syntax

How to use the DY syntax is already documented in the user guide, but it's very important to first understand how it is defined technically.

Parsing engine vs syntax definitions
There is a clear separation between these 2 parts:

  1. The parsing engine: it implements the core principles that make prefixes, keywords, blocks, the blocks tree, value extraction, different value types, ... possible. It defines how parsing is done. The how doesn't depend on what is parsed: the engine doesn't care about which prefixes will be used in the end.
  2. The syntax definitions: they define the list of blocks that can be parsed, meaning a list of prefixes with some associated settings. They answer the "what to parse" question: a solution, an explanation, a description, a goal, ... You can find the details in the common/parser folder, in course.ts, skills.ts and exos.ts.
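To make this separation more concrete, here is a minimal sketch of how a syntax definition could be expressed as plain data that a generic engine function consumes. All names here (`BlockDefinition`, `ValueType`, `exoSyntax`, `findPrefix`) and the prefix strings are assumptions made for illustration; they are not taken from common/parser.

```typescript
// Hypothetical shapes for illustration only, not the real ones from common/parser.
type ValueType = "single-line" | "multi-line" | "list";

interface BlockDefinition {
  prefix: string; // text expected at the start of a line (assumed spelling)
  type: string; // identifier attached to the parsed block
  valueType: ValueType; // how the engine should extract the value
  children?: BlockDefinition[]; // prefixes that are only valid inside this block
}

// "What to parse": a syntax definition is just data.
const exoSyntax: BlockDefinition[] = [
  {
    prefix: "Exo:",
    type: "exo",
    valueType: "single-line",
    children: [
      { prefix: "Instruction:", type: "instruction", valueType: "multi-line" },
      { prefix: "Solution:", type: "solution", valueType: "multi-line" },
    ],
  },
];

// "How to parse": the engine only sees generic definitions and never
// hard-codes a specific prefix.
function findPrefix(
  line: string,
  defs: BlockDefinition[]
): BlockDefinition | undefined {
  const trimmed = line.trimStart();
  return defs.find((d) => trimmed.startsWith(d.prefix));
}
```

With this shape, supporting courses or skills would mean writing another array of definitions, while `findPrefix` stays unchanged.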

Block sections split

Splitting the text into block sections is a recursive algorithm, because we want to match the block of a prefix only at the correct level. For example, we want to match a Solution section only if it is inside an Exo section, so we first cut out the Exo sections (middle of the schema) and then recursively cut them into smaller sections to separate the Title itself from other sections like the solution and the instruction (right of the schema).

The order at each level matters! The options list must be provided after the instruction: if the instruction is given after the options list, it will be ignored. The parser works by reading the start of each line; if it finds a prefix, it takes one or more lines until another prefix is found or the end of the section is reached.
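The recursive cut described above can be sketched as follows. This is a simplified illustration and not the actual engine: the types, the function names and the prefix spellings are assumptions, and the sketch does not enforce the ordering rule between sibling blocks.

```typescript
// Hypothetical types for illustration only.
interface BlockDefinition {
  prefix: string;
  type: string;
  children?: BlockDefinition[];
}

interface Block {
  type: string;
  lines: string[]; // the block's own text, first line stripped of its prefix
  children: Block[];
}

function matchPrefix(line: string, defs: BlockDefinition[]): BlockDefinition | undefined {
  const trimmed = line.trimStart();
  return defs.find((d) => trimmed.startsWith(d.prefix));
}

function splitBlocks(lines: string[], defs: BlockDefinition[]): Block[] {
  // Step 1: cut this level into sections, looking only at this level's prefixes.
  const sections: { def: BlockDefinition; lines: string[] }[] = [];
  for (const line of lines) {
    const def = matchPrefix(line, defs);
    if (def) {
      const value = line.trimStart().slice(def.prefix.length).trim();
      sections.push({ def, lines: [value] });
    } else if (sections.length > 0) {
      // No prefix of this level: the line belongs to the section in progress.
      sections[sections.length - 1].lines.push(line);
    }
    // Lines before the first prefix of this level are ignored in this sketch.
  }

  // Step 2: cut each section again, this time with its child prefixes only.
  return sections.map(({ def, lines: sectionLines }) => {
    const childDefs = def.children ?? [];
    const firstChild = sectionLines.findIndex(
      (l, i) => i > 0 && matchPrefix(l, childDefs) !== undefined
    );
    const own = firstChild === -1 ? sectionLines : sectionLines.slice(0, firstChild);
    const rest = firstChild === -1 ? [] : sectionLines.slice(firstChild);
    return { type: def.type, lines: own, children: splitBlocks(rest, childDefs) };
  });
}

// Usage with an assumed syntax: Solution is only matched inside an Exo.
const syntax: BlockDefinition[] = [
  {
    prefix: "Exo:",
    type: "exo",
    children: [
      { prefix: "Instruction:", type: "instruction" },
      { prefix: "Solution:", type: "solution" },
    ],
  },
];

const tree = splitBlocks(
  [
    "Solution: orphan, not inside an Exo",
    "Exo: First",
    "Instruction: Do this",
    "and that",
    "Solution: 42",
    "Exo: Second",
  ],
  syntax
);
console.log(JSON.stringify(tree, null, 2));
```

Because each recursive call receives only the child definitions of the enclosing block, a prefix can never match at the wrong level, which is the property the paragraph above relies on.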

Blocks tree

This JSON representation of the tree shows that all blocks have a type attribute, as defined in the syntax.
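As an illustration of such a tree, here is a small hypothetical example. The `BlockNode` shape, the sample values and the `outline` helper are assumptions; the only property taken from the text above is that every block carries a `type`.

```typescript
// Hypothetical node shape: only the `type` attribute is stated in this document.
interface BlockNode {
  type: string;
  value: string;
  children: BlockNode[];
}

// Made-up sample tree, not real parser output.
const sampleTree: BlockNode[] = [
  {
    type: "exo",
    value: "First exercise",
    children: [
      { type: "instruction", value: "Do this", children: [] },
      { type: "solution", value: "42", children: [] },
    ],
  },
];

// Walk the tree depth-first and list each block's type, indented by depth.
function outline(nodes: BlockNode[], depth = 0): string[] {
  const result: string[] = [];
  for (const node of nodes) {
    result.push("  ".repeat(depth) + node.type);
    result.push(...outline(node.children, depth + 1));
  }
  return result;
}

console.log(JSON.stringify(sampleTree, null, 2));
console.log(outline(sampleTree).join("\n"));
```

Later steps, such as error detection or conversion to fields, can dispatch on `type` while walking the tree in the same way.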