For In Grammar Breaking Identifiers
I'm trying to make a syntax highlighter in Atom for a toy language I'm working on. I'm at the stage where I'm defining the context-free grammar. I've been building it up step by step and writing tests along the way. When I added the grammar for a for-in loop, it broke my test for parsing identifiers, because the identifier started with "in". Here's the grammar as it stands now (sorry for pasting so much code, but I didn't know what might be relevant, so I just added the whole thing):
    module.exports = grammar({
      name: 'MooLang',

      rules: {
        source_file: $ => repeat($._declaration),

        _declaration: $ => choice(
          $.variable_declaration,
          $._statement
        ),

        variable_declaration: $ => seq(
          choice('var', 'let'),
          $.identifier,
          optional(seq(
            ':', $._type
          )),
          optional(seq(
            '=', $._expression
          )),
          $.eol
        ),

        _statement: $ => choice(
          $.for_statement,
          $.expression_statement
        ),

        for_statement: $ => prec(0, seq(
          'for',
          '(',
          choice(
            $.variable_declaration,
            $.expression_statement,
          ),
          'in',
          $._expression,
          ')',
          $._statement
        )),

        expression_statement: $ => prec(1, seq(
          $._expression,
          $.eol
        )),

        _expression: $ => choice(
          $.assignment,
          $.comparison_expression,
          $.addition_expression,
          $.multiplication_expression,
          $.unary_expression,
          prec(5, $.primary),
          prec(-1, $._type) // TODO:(Casey) Remove this
        ),

        assignment: $ => prec.right(0, seq(
          $.identifier,
          '=',
          $._expression
        )),

        comparison_expression: $ => prec.left(1, seq(
          $._expression,
          choice('<', '<=', '>', '>=', '==', '!='),
          $._expression
        )),

        addition_expression: $ => prec.left(2, seq(
          $._expression,
          choice('+', '-'),
          $._expression
        )),

        multiplication_expression: $ => prec.left(3, seq(
          $._expression,
          choice('*', '/', '%'),
          $._expression
        )),

        unary_expression: $ => prec.right(4, seq(
          choice('!', '-'),
          $.primary
        )),

        _type: $ => choice(
          $.primitive_type,
          $.list_type,
          $.map_type
        ),

        primitive_type: $ => choice(
          'bool', 'string',
          'int8', 'int16', 'int32', 'int64',
          'uint8', 'uint16', 'uint32', 'uint64',
          'float32', 'float64'
        ),

        list_type: $ => seq(
          '[',
          $._type,
          ']'
        ),

        map_type: $ => seq(
          '{',
          $._type,
          ':',
          $._type,
          '}'
        ),

        primary: $ => choice(
          $.bool_literal,
          $.list_literal,
          $.map_literal,
          $.parenthetical_expression,
          $.identifier,
          $.number
        ),

        bool_literal: $ => choice('true', 'false'),

        list_literal: $ => seq(
          '[',
          optional(seq(
            $._expression,
            repeat(seq(
              ',',
              $._expression
            )),
            optional(','),
          )),
          ']'
        ),

        map_literal: $ => seq(
          '{',
          optional(seq(
            $._expression,
            ':',
            $._expression,
            repeat(seq(
              ',',
              $._expression,
              ':',
              $._expression,
            )),
          )),
          '}'
        ),

        parenthetical_expression: $ => seq(
          '(',
          $._expression,
          ')'
        ),

        identifier: $ => prec(99, /[a-zA-Z_][a-zA-Z0-9_]*/),

        number: $ => prec(99, /\d+(_\d+)*(\.\d+)?/),

        eol: $ => '\n'
      }
    });
Here are the relevant tests:
    ==================
    Identifier Tests
    ==================
    13india
    ---
    (source_file
      (expression_statement (primary (number)) (MISSING))
      (expression_statement (primary (identifier)) (eol))
    )

    ==================
    For Tests
    ==================
    for (var a in people) a + 1
    ---
    (source_file
      (for_statement (variable_declaration (identifier)) (primary (identifier)) (expression_statement (addition_expression (primary (identifier)) (primary (number))) (eol)))
    )
Until I added the grammar for the for loop all of the Identifier Tests were passing but now I get this output:

My guess is that it finds an unexpected 'd' because it thinks this is the 'in' keyword. But I can't figure out why it would think that since it doesn't match anything else about the for loop.
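The guess above can be made concrete with a toy lexer. This is only a sketch under stated assumptions: it is not tree-sitter's lexer, and the names `lexNaive`, `lexWordFirst`, and `KEYWORDS` are invented for illustration. It shows how a keyword literal that is tried before the identifier pattern would split `india` into `in` plus a leftover starting with `d`, and how matching the whole word first avoids that.

```javascript
// Toy lexer, NOT tree-sitter's implementation. All names here are hypothetical.
const IDENTIFIER = /^[a-zA-Z_][a-zA-Z0-9_]*/;
const KEYWORDS = ['for', 'in', 'var', 'let'];

// Naive strategy: accept a keyword as soon as the input starts with it.
function lexNaive(input) {
  for (const kw of KEYWORDS) {
    if (input.startsWith(kw)) return { type: 'keyword', text: kw };
  }
  const m = IDENTIFIER.exec(input);
  return m ? { type: 'identifier', text: m[0] } : null;
}

// Word-first strategy: match the whole word, then check whether that
// complete word happens to be a keyword.
function lexWordFirst(input) {
  const m = IDENTIFIER.exec(input);
  if (!m) return null;
  const type = KEYWORDS.includes(m[0]) ? 'keyword' : 'identifier';
  return { type, text: m[0] };
}

console.log(lexNaive('india'));     // { type: 'keyword', text: 'in' }
console.log(lexWordFirst('india')); // { type: 'identifier', text: 'india' }
console.log(lexWordFirst('in'));    // { type: 'keyword', text: 'in' }
```

Tree-sitter's documentation describes a `word` grammar property for keyword extraction that works along the lines of the second strategy. Whether setting it resolves the failure in this particular grammar has not been verified here.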
syntax-highlighting atom-editor context-free-grammar
What language is this CFG defined in? Specifically what exactly does 'prec' mean? Also, '13india' is not an identifier according to the regex definition since it starts with a number. What is your test asserting about '13india'?
– Jerome Baek
Mar 24 at 8:30
The CFG is defined in tree-sitter which is what Atom uses to build its highlighters. Yeah, that test tested that identifiers couldn't start with a number. That's why the result has the (MISSING) in there.
– CaseyB
Mar 24 at 9:34
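As a small illustration of what the `13india` test relies on, the two token patterns from the grammar can be tried directly in JavaScript. This is a sketch, assuming the patterns are anchored at the start of the input with `^` (the anchors and the helper name `splitLeadingNumber` are not part of the grammar).

```javascript
// Token patterns copied from the grammar, anchored with ^ for this demo.
const NUMBER = /^\d+(_\d+)*(\.\d+)?/;
const IDENTIFIER = /^[a-zA-Z_][a-zA-Z0-9_]*/;

// Hypothetical helper: peel a leading number off the input, then try to
// read the remainder as an identifier.
function splitLeadingNumber(input) {
  const num = NUMBER.exec(input);
  if (!num) return null;
  const rest = input.slice(num[0].length);
  const id = IDENTIFIER.exec(rest);
  return { number: num[0], identifier: id ? id[0] : null };
}

console.log(IDENTIFIER.test('13india'));    // false: cannot start with a digit
console.log(splitLeadingNumber('13india')); // { number: '13', identifier: 'india' }
```

The input therefore splits into a number followed by an identifier, which matches the two `expression_statement` nodes the test expects, with the `(MISSING)` node standing in for the absent line break between them.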
edited Mar 24 at 7:29 by marc_s
asked Mar 24 at 6:41 by CaseyB