jsg

json-grammar - a grammar-based validator for JSON structures

JSON Grammar, or JSG, is a language for describing the structure of JSON documents. It can be used for documentation (describing what a service or tool consumes or emits) and for validation (testing conformance of some data to that description).

See a simple online demo.

Language

A JSG schema is composed of objects, rules and values. Objects are represented by a production name followed by a “{”, some named members or rule references, and “}”. These describe JSON objects like { "street":"Elm", "number":"123b" }. A member is composed of an attribute name, a “:”, and a type: { "street":NAME, "number":NUMBER }. A type can be a constant, value pattern, rule name, or a list of types:

By convention, value patterns are labeled in ALL CAPS.

JSON Grammar | matching JSON
doc { status:"ready" }
  • { "status":"ready" }
doc { street:NAME no:NUM }
NAME : .*;
NUM : [0-9]+[a-e]?;
  • { "street":"Elm", "no":"1" }
  • { "street":"Elm", "no":"123" }
  • { "street":"Elm", "no":"123b" }
doc { street:(NAME|"*"|TEMPLATE) }
NAME : .*;
TEMPLATE : '{' .* '}';
  • { "street":"Elm" }
  • { "street":"*" }
  • { "street":"{mumble}" }
doc { street:nameOrTemplate }
nameOrTemplate = NAME | "*" | TEMPLATE
NAME : .*;
TEMPLATE : '{' .* '}';
  • { "street":"Elm" }
  • { "street":"*" }
  • { "street":"{mumble}" }
doc { street:[nameOrTemplate{2,}] }
nameOrTemplate = NAME | "*" | TEMPLATE
NAME : .*;
TEMPLATE : '{' .* '}';
  • { "street":["Elm","X"] }
  • { "street":["*", "X", "{mumble}"] }
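The member types in the table above (constant, value pattern, alternatives, and list) can be modelled in a few lines of JavaScript. This is only a sketch for building intuition, not the jsg library's API: `matchesType`, `matchesObject`, and the `{ listOf, min }` shape are hypothetical names, JavaScript regular expressions stand in for JSG value patterns, and it assumes every listed member is required and no other members are allowed.

```javascript
// Toy model of JSG member types. All names here are hypothetical.
const NAME = /^.*$/;              // NAME : .*;
const NUM = /^[0-9]+[a-e]?$/;     // NUM : [0-9]+[a-e]?;
const TEMPLATE = /^\{.*\}$/;      // TEMPLATE : '{' .* '}';
const nameOrTemplate = [NAME, "*", TEMPLATE]; // nameOrTemplate = NAME | "*" | TEMPLATE

// A type is an array of alternatives, a RegExp value pattern,
// a { listOf, min } list description, or a constant string.
function matchesType(type, value) {
  if (Array.isArray(type)) return type.some((t) => matchesType(t, value));
  if (type instanceof RegExp) return typeof value === "string" && type.test(value);
  if (typeof type === "object" && type !== null) {
    return Array.isArray(value) && value.length >= type.min &&
      value.every((v) => matchesType(type.listOf, v));
  }
  return value === type; // constant
}

// Assumption: the object must have exactly the listed members.
function matchesObject(members, obj) {
  const expected = Object.keys(members);
  const actual = Object.keys(obj);
  return actual.length === expected.length &&
    expected.every((k) => actual.includes(k) && matchesType(members[k], obj[k]));
}

console.log(matchesObject({ street: NAME, no: NUM }, { street: "Elm", no: "123b" }));
console.log(matchesObject({ street: { listOf: nameOrTemplate, min: 2 } },
                          { street: ["Elm", "X"] }));
```

Writing alternatives as an array mirrors the `|` in the grammar, and naming that array mirrors what a rule name such as nameOrTemplate does in a schema.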

A schema can be composed with no rules but rule names can help with:

Values

Values are represented by a terminal name followed by a “:”, a regular pattern (cf. lex), and a “;”. They can reference each other (but not circularly), allowing a value to be composed of other values. The syntax is reminiscent of EBNF or W3C language specifications, e.g.:

Value pattern | matching JSON
'@' START+ ('-' MIDCHAR+)*
START : [a-zA-Z];
MIDCHAR : START | [0-9];
NUM : [0-9]+[a-e]?;
  • @en
  • @en-US
  • @de-CH-1901
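To see how one value can be composed of others, the pattern above can be approximated with JavaScript regular expressions built from smaller pieces. This is a sketch, not how jsg compiles patterns: the name `LANGTAG` is hypothetical (the pattern is unnamed in the table), and each terminal is kept as a regex source string so that later terminals can embed earlier ones.

```javascript
// Each terminal is a regex source string; composition is plain string embedding.
const START = "[a-zA-Z]";              // START : [a-zA-Z];
const MIDCHAR = `(?:${START}|[0-9])`;  // MIDCHAR : START | [0-9];

// '@' START+ ('-' MIDCHAR+)*   (LANGTAG is a hypothetical name)
const LANGTAG = new RegExp(`^@${START}+(?:-${MIDCHAR}+)*$`);

for (const candidate of ["@en", "@en-US", "@de-CH-1901", "en", "@en-"]) {
  console.log(candidate, LANGTAG.test(candidate));
}
```

Because references are expanded by substitution, a circular reference would never terminate, which is consistent with the restriction that values may not reference each other circularly.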

Code points in values can be specified by:

If we had a disdain for writing the letter ‘a’ and the symbol ‘@’, we could write the above value pattern as:

\u0040 START+ ('-' MIDCHAR+)*
START : [\u0061-z\u0041-Z];
MIDCHAR : START | [0-9];
NUM : [0-9]+[\u0061-e]?;
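JavaScript regular expressions accept the same \uXXXX escape form, which gives a quick way to check that the escaped and plain spellings describe the same strings. This is an analogy under the assumption that JSG treats \uXXXX the way JavaScript does; the variable names are illustrative only.

```javascript
// The escaped and plain spellings denote the same characters.
console.log("\u0040" === "@", "\u0061" === "a", "\u0041" === "A");

// Plain and escaped versions of the START character class and the leading '@'.
const plain = /^@[a-zA-Z]+$/;
const escaped = /^\u0040[\u0061-z\u0041-Z]+$/;

for (const candidate of ["@en", "@EN", "en", "@e1"]) {
  console.log(candidate, plain.test(candidate), escaped.test(candidate));
}
```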

.Directives

| JSON Grammar | | JSON |
| --- | --- | --- |
| doc { a:STRING } STRING=".*" | passes | { "a":"hi" } |
| doc { a:STRING } STRING=".*" | fails | { "type":"doc", "a":"hi" } |
| .IGNORE type; doc { a:STRING } STRING=".*" | passes | { "type":"doc", "a":"hi" } |
| doc { a:STRING, type:STRING } STRING=".*" | passes | { "type":"doc", "a":"hi" } |
| .TYPE type; doc { a:STRING } STRING=".*" | passes | { "type":"doc", "a":"hi" } |
| .TYPE type; doc { a:STRING } STRING=".*" | fails | { "type":"docXXX", "a":"hi" } |

You can push the .TYPE property into each object if you want (and have to if it’s not universal). Error reports on schemas with a .TYPE directive tend to be terser, as failing a discriminator check shortcuts the tests of all the other object properties.
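The six pass/fail cases in the table above can be reproduced with a small toy validator. This is a sketch of the described behaviour, not the jsg implementation: the `validate` function and the `{ name, members, typeProp, ignore }` schema shape are hypothetical, and it assumes every listed member is required.

```javascript
// Toy model of .TYPE and .IGNORE. All names and shapes here are hypothetical.
const STRING = /^.*$/;
const has = (o, k) => Object.prototype.hasOwnProperty.call(o, k);

// Returns a list of error strings; an empty list means the object passes.
function validate(schema, obj) {
  const ignore = new Set(schema.ignore || []);
  if (schema.typeProp !== undefined) {
    // .TYPE: the discriminator must equal the production name. On failure,
    // return at once so the other properties are never tested.
    if (obj[schema.typeProp] !== schema.name) {
      return [`expected ${schema.typeProp} to be "${schema.name}"`];
    }
    ignore.add(schema.typeProp); // already checked above
  }
  const errors = [];
  for (const key of Object.keys(obj)) {
    if (ignore.has(key)) continue; // .IGNORE: skip this property entirely
    if (!has(schema.members, key)) {
      errors.push(`unexpected property "${key}"`);
    } else if (typeof obj[key] !== "string" || !schema.members[key].test(obj[key])) {
      errors.push(`bad value for "${key}"`);
    }
  }
  for (const key of Object.keys(schema.members)) {
    if (!has(obj, key)) errors.push(`missing property "${key}"`); // assumption: required
  }
  return errors;
}

const data = { type: "doc", a: "hi" };
console.log(validate({ name: "doc", members: { a: STRING } }, data));
console.log(validate({ name: "doc", members: { a: STRING }, ignore: ["type"] }, data));
console.log(validate({ name: "doc", members: { a: STRING }, typeProp: "type" }, data));
```

The early return on a discriminator mismatch is what keeps the error report to a single entry in this model.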

Contributing

All PRs welcome. Please run tests first:

Testing

JSG has a set of built-in tests. It also tests JSON structures from the ShEx and SPARQL.js repositories. This presumes specific relative paths between the checkouts of those repositories and JSG. You can arrange this by checking everything out under one directory, e.g. github:

mkdir github
cd github
git clone git@github.com:shexSpec/shexTest shexSpec/shexTest
git clone git@github.com:RubenVerborgh/SPARQL.js RubenVerborgh/SPARQL.js
# now to get JSG, initialize it and run the tests:
git clone git@github.com:ericprud/jsg ericprud/jsg
cd ericprud/jsg
npm install
npm run test-all

test/test.js has an easy way to enter passing and failing tests, e.g.

   ["ShExJ.jsg", "empty.json", true],
   ["ShExJ.jsg", "bad-noType.json", "type"],
   ["ShExJ.jsg", "bad-wrongType.json", false],

which tests that empty.json passes, that bad-noType.json fails with an error mentioning “type”, and that bad-wrongType.json fails for any reason.
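The third element of each entry therefore has three possible meanings. The sketch below shows one way such an expectation could be compared against a validation outcome. It is not the code in test/test.js: `outcomeMatches` is a hypothetical helper, and it assumes the outcome is represented as `null` for a pass or an error message string for a failure.

```javascript
// Hypothetical helper: `error` is null when validation passed,
// otherwise the error message produced by the failure.
function outcomeMatches(expected, error) {
  if (expected === true) return error === null;       // must pass
  if (expected === false) return error !== null;      // must fail, for any reason
  return error !== null && error.includes(expected);  // must fail, mentioning the string
}

console.log(outcomeMatches(true, null));
console.log(outcomeMatches("type", "missing property: type"));
console.log(outcomeMatches(false, "some unrelated error"));
```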