rewriter.coffee

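The CoffeeScript language has a good deal of optional syntax, implicit syntax, and shorthand syntax. This can greatly complicate a grammar and bloat the resulting parse table. Instead of making the parser handle it all, we take a series of passes over the token stream, using this **Rewriter** to convert shorthand into the unambiguous long form, add implicit indentation and parentheses, and generally clean things up.

Create a generated token: one that exists only because implicit syntax was used.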

generate = (tag, value, origin) ->
  tok = [tag, value]
  tok.generated = yes
  tok.origin = origin if origin
  tok
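
To make the shape of a generated token concrete, here is a minimal sketch that repeats the helper so it can run on its own. The origin triple mirrors the one passed by startImplicitCall further down; the `fakeLocation` object is made up for the illustration and is not taken from a real lexer run.

```coffeescript
# Standalone copy of the helper above, so this snippet runs by itself.
generate = (tag, value, origin) ->
  tok = [tag, value]
  tok.generated = yes
  tok.origin = origin if origin
  tok

# Hypothetical location data; real tokens carry theirs in token[2].
fakeLocation = {first_line: 0, first_column: 2, last_line: 0, last_column: 2}

callStart = generate 'CALL_START', '(', ['', 'implicit function call', fakeLocation]

console.log callStart[0], callStart[1]    # CALL_START (
console.log callStart.generated           # true
console.log callStart.origin[1]           # implicit function call

# Without an origin argument, no origin property is attached.
bare = generate '}', '}'
console.log bare.origin?                  # false
```

A token is just a two-element array; the generated and origin flags ride along as extra properties on that array, which is why later passes can test token.generated directly.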

The **Rewriter** class is used by the Lexer, directly against its internal array of tokens.
```
exports.Rewriter = class Rewriter
```
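Rewrite the token stream in multiple passes, one logical filter at a time. This could certainly be changed into a single pass through the stream with one big, efficient switch, but it is much nicer to work with like this. The order of these passes matters: indentation must be corrected before implicit parentheses can be wrapped around blocks of code.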
```
  rewrite: (@tokens) ->
```

Helpful snippet for debugging: console.log (t[0] + '/' + t[1] for t in @tokens).join ' '

    @removeLeadingNewlines()
    @closeOpenCalls()
    @closeOpenIndexes()
    @normalizeLines()
    @tagPostfixConditionals()
    @addImplicitBracesAndParens()
    @addLocationDataToGeneratedTokens()
    @fixOutdentLocationData()
    @tokens

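Rewrite the token stream, looking one token ahead and behind. The return value of the block tells us how many tokens to move forwards (or backwards) in the stream, so that nothing is missed as tokens are inserted and removed and the stream changes length under our feet.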

  scanTokens: (block) ->
    {tokens} = this
    i = 0
    i += block.call this, token, i, tokens while token = tokens[i]
    true
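
The role of the block's return value is easiest to see on a toy stream. The sketch below is a standalone variant of scanTokens that takes the token array as an argument instead of reading it from the instance; the pass that collapses repeated TERMINATOR tokens is hypothetical and is not one of the Rewriter's real passes.

```coffeescript
# Standalone variant of scanTokens: the token array is passed in explicitly.
scanTokens = (tokens, block) ->
  i = 0
  i += block tokens[i], i, tokens while tokens[i]
  true

# Toy stream of [tag, value] pairs; the tags are only illustrative.
tokens = [
  ['IDENTIFIER', 'a']
  ['TERMINATOR', '\n']
  ['TERMINATOR', '\n']
  ['IDENTIFIER', 'b']
]

# Hypothetical pass: collapse runs of TERMINATOR tokens into a single one.
scanTokens tokens, (token, i, tokens) ->
  if token[0] is 'TERMINATOR' and tokens[i + 1]?[0] is 'TERMINATOR'
    tokens.splice i, 1   # remove the current token ...
    return 0             # ... and stay put, because the next token moved to i
  1                      # otherwise advance by one token

console.log (t[0] for t in tokens).join ' '
# IDENTIFIER TERMINATOR IDENTIFIER
```

Returning 0 after a removal re-examines the same index, which is the same trick normalizeLines uses when it deletes a TERMINATOR.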

  detectEnd: (i, condition, action) ->
    {tokens} = this
    levels = 0
    while token = tokens[i]
      return action.call this, token, i     if levels is 0 and condition.call this, token, i
      return action.call this, token, i - 1 if not token or levels < 0
      if token[0] in EXPRESSION_START
        levels += 1
      else if token[0] in EXPRESSION_END
        levels -= 1
      i += 1
    i - 1
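
detectEnd carries no comment of its own, so here is a hedged reading of it: it walks forward from i, counts nesting depth using EXPRESSION_START and EXPRESSION_END, and fires action at the first token that satisfies condition at depth zero. The sketch below is a simplified standalone version with shortened tag lists and a hand-written token stream for `f(g(x), y)`; both are assumptions made for the example.

```coffeescript
# Shortened tag lists, just enough for this example.
EXPRESSION_START = ['(', '[', 'CALL_START']
EXPRESSION_END   = [')', ']', 'CALL_END']

# Simplified standalone detectEnd: the token array is passed in explicitly.
detectEnd = (tokens, i, condition, action) ->
  levels = 0
  while token = tokens[i]
    return action token, i     if levels is 0 and condition token, i
    return action token, i - 1 if levels < 0
    if token[0] in EXPRESSION_START
      levels += 1
    else if token[0] in EXPRESSION_END
      levels -= 1
    i += 1
  i - 1

# Hand-written stream for `f(g(x), y)`; both closing parens are still plain ')'.
tokens = [
  ['IDENTIFIER', 'f'], ['CALL_START', '(']
  ['IDENTIFIER', 'g'], ['CALL_START', '('], ['IDENTIFIER', 'x'], [')', ')']
  [',', ','], ['IDENTIFIER', 'y'], [')', ')']
]

condition = (token) -> token[0] in [')', 'CALL_END']
action = (token, i) ->
  token[0] = 'CALL_END'
  i

# Start just after the outer CALL_START at index 1.
closeIdx = detectEnd tokens, 2, condition, action

console.log closeIdx        # 8
console.log tokens[8][0]    # CALL_END
console.log tokens[5][0]    # ) (the inner paren is skipped at depth 1)
```

The inner `)` is ignored on this scan because it sits at depth one; in the real closeOpenCalls it would be retagged when scanTokens reaches the inner CALL_START.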

Leading newlines would introduce an ambiguity in the grammar, so we dispatch them here.

  removeLeadingNewlines: ->
    break for [tag], i in @tokens when tag isnt 'TERMINATOR'
    @tokens.splice 0, i if i
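
The `break for ... when` line is terse: the loop does nothing except stop at the first token that is not a TERMINATOR, leaving its index in i. A small standalone sketch on a made-up token array shows the effect.

```coffeescript
# Made-up stream with two leading newlines.
tokens = [
  ['TERMINATOR', '\n']
  ['TERMINATOR', '\n']
  ['IDENTIFIER', 'a']
  ['TERMINATOR', '\n']
]

# The loop body is just `break`; afterwards i is the index of the first
# token whose tag is not TERMINATOR.
break for [tag], i in tokens when tag isnt 'TERMINATOR'

# Drop everything before that index (nothing to do when i is 0).
tokens.splice 0, i if i

console.log (t[0] for t in tokens).join ' '
# IDENTIFIER TERMINATOR
```

Only the leading run is removed; the trailing TERMINATOR stays in place.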

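The lexer has tagged the opening parenthesis of a method call. Match it with its paired close. The mis-nested outdent case is included here for calls that close on the same line, just before their outdent.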

  closeOpenCalls: ->
    condition = (token, i) ->
      token[0] in [')', 'CALL_END'] or
      token[0] is 'OUTDENT' and @tag(i - 1) is ')'

    action = (token, i) ->
      @tokens[if token[0] is 'OUTDENT' then i - 1 else i][0] = 'CALL_END'

    @scanTokens (token, i) ->
      @detectEnd i + 1, condition, action if token[0] is 'CALL_START'
      1

The lexer has tagged the opening bracket of an indexing operation. Match it with its paired close.

  closeOpenIndexes: ->
    condition = (token, i) ->
      token[0] in [']', 'INDEX_END']

    action = (token, i) ->
      token[0] = 'INDEX_END'

    @scanTokens (token, i) ->
      @detectEnd i + 1, condition, action if token[0] is 'INDEX_START'
      1

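Match tags in the token stream starting at i against pattern, skipping 'HERECOMMENT's. pattern may consist of strings (equality), arrays of strings (one of), or null (wildcard). Returns the index of the match, or -1 if there is no match.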

  indexOfTag: (i, pattern...) ->
    fuzz = 0
    for j in [0 ... pattern.length]
      fuzz += 2 while @tag(i + j + fuzz) is 'HERECOMMENT'
      continue if not pattern[j]?
      pattern[j] = [pattern[j]] if typeof pattern[j] is 'string'
      return -1 if @tag(i + j + fuzz) not in pattern[j]
    i + j + fuzz - 1
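
The three pattern forms can be tried out with a standalone sketch. `tagAt` is a hypothetical stand-in for the class's tag method, and the token arrays are hand-written; the body of indexOfTag is otherwise the same as above.

```coffeescript
# Hypothetical stand-in for @tag: look up a tag by index, tolerating gaps.
tagAt = (tokens, i) -> tokens[i]?[0]

# Same logic as the method above, with the token array passed in.
indexOfTag = (tokens, i, pattern...) ->
  fuzz = 0
  for j in [0 ... pattern.length]
    fuzz += 2 while tagAt(tokens, i + j + fuzz) is 'HERECOMMENT'
    continue if not pattern[j]?
    pattern[j] = [pattern[j]] if typeof pattern[j] is 'string'
    return -1 if tagAt(tokens, i + j + fuzz) not in pattern[j]
  i + j + fuzz - 1

plain     = [['IDENTIFIER', 'a'], [':', ':'], ['NUMBER', '1']]
atProp    = [['@', '@'], ['PROPERTY', 'a'], [':', ':'], ['NUMBER', '1']]
commented = [['HERECOMMENT', 'c'], ['TERMINATOR', '\n'], ['IDENTIFIER', 'a'], [':', ':']]

console.log indexOfTag(plain, 0, null, ':')                        # 1
console.log indexOfTag(plain, 0, ['IDENTIFIER', 'STRING'], ':')    # 1
console.log indexOfTag(atProp, 0, '@', null, ':')                  # 2
console.log indexOfTag(plain, 0, 'NUMBER')                         # -1
console.log indexOfTag(commented, 0, null, ':')                    # 3
```

The returned index points at the last token of the match, and a HERECOMMENT is skipped two tokens at a time because the code assumes it is followed by its own TERMINATOR.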

Returns yes if standing in front of something that looks like @<x>:, <x>: or <EXPRESSION_START><x>...<EXPRESSION_END>:, skipping over 'HERECOMMENT's.

  looksObjectish: (j) ->
    return yes if @indexOfTag(j, '@', null, ':') > -1 or @indexOfTag(j, null, ':') > -1
    index = @indexOfTag(j, EXPRESSION_START)
    if index > -1
      end = null
      @detectEnd index + 1, ((token) -> token[0] in EXPRESSION_END), ((token, i) -> end = i)
      return yes if @tag(end + 1) is ':'
    no

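Returns yes if the current line of tokens contains an element of tags on the same expression level. Stops searching at LINEBREAKS or at the explicit start of the containing balanced expression.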

  findTagsBackwards: (i, tags) ->
    backStack = []
    while i >= 0 and (backStack.length or
          @tag(i) not in tags and
          (@tag(i) not in EXPRESSION_START or @tokens[i].generated) and
          @tag(i) not in LINEBREAKS)
      backStack.push @tag(i) if @tag(i) in EXPRESSION_END
      backStack.pop() if @tag(i) in EXPRESSION_START and backStack.length
      i -= 1
    @tag(i) in tags

Look for signs of implicit calls and objects in the token stream and add them.
```
  addImplicitBracesAndParens: ->
```

Track the current balancing depth (both implicit and explicit) on the stack.

    stack = []
    start = null

    @scanTokens (token, i, tokens) ->
      [tag]     = token
      [prevTag] = prevToken = if i > 0 then tokens[i - 1] else []
      [nextTag] = if i < tokens.length - 1 then tokens[i + 1] else []
      stackTop  = -> stack[stack.length - 1]
      startIdx  = i

Helper function used to keep track of the number of tokens consumed and spliced when returning to fetch a new token.
```
      forward   = (n) -> i - startIdx + n
```

Helper functions

      isImplicit        = (stackItem) -> stackItem?[2]?.ours
      isImplicitObject  = (stackItem) -> isImplicit(stackItem) and stackItem?[0] is '{'
      isImplicitCall    = (stackItem) -> isImplicit(stackItem) and stackItem?[0] is '('
      inImplicit        = -> isImplicit stackTop()
      inImplicitCall    = -> isImplicitCall stackTop()
      inImplicitObject  = -> isImplicitObject stackTop()

Unclosed control statement inside implicit parens (like a class declaration or if-conditionals)

      inImplicitControl = -> inImplicit() and stackTop()?[0] is 'CONTROL'

      startImplicitCall = (j) ->
        idx = j ? i
        stack.push ['(', idx, ours: yes]
        tokens.splice idx, 0, generate 'CALL_START', '(', ['', 'implicit function call', token[2]]
        i += 1 if not j?

      endImplicitCall = ->
        stack.pop()
        tokens.splice i, 0, generate 'CALL_END', ')', ['', 'end of input', token[2]]
        i += 1

      startImplicitObject = (j, startsLine = yes) ->
        idx = j ? i
        stack.push ['{', idx, sameLine: yes, startsLine: startsLine, ours: yes]
        val = new String '{'
        val.generated = yes
        tokens.splice idx, 0, generate '{', val, token
        i += 1 if not j?

      endImplicitObject = (j) ->
        j = j ? i
        stack.pop()
        tokens.splice j, 0, generate '}', '}', token
        i += 1

Don't end an implicit call on the next indent if any of these are in an argument

      if inImplicitCall() and tag in ['IF', 'TRY', 'FINALLY', 'CATCH',
        'CLASS', 'SWITCH']
        stack.push ['CONTROL', i, ours: yes]
        return forward(1)

      if tag is 'INDENT' and inImplicit()

An INDENT closes an implicit call unless

1. We have seen a CONTROL argument on the line.
2. The last token before the indent is part of the list below.

        if prevTag not in ['=>', '->', '[', '(', ',', '{', 'TRY', 'ELSE', '=']
          endImplicitCall() while inImplicitCall()
        stack.pop() if inImplicitControl()
        stack.push [tag, i]
        return forward(1)

Straightforward start of an explicit expression

      if tag in EXPRESSION_START
        stack.push [tag, i]
        return forward(1)

Close all implicit expressions inside of explicitly closed expressions.

      if tag in EXPRESSION_END
        while inImplicit()
          if inImplicitCall()
            endImplicitCall()
          else if inImplicitObject()
            endImplicitObject()
          else
            stack.pop()
        start = stack.pop()

Recognize standard implicit calls like f a, f() b, f? c, h[0] d etc.

      if (tag in IMPLICIT_FUNC and token.spaced or
          tag is '?' and i > 0 and not tokens[i - 1].spaced) and
         (nextTag in IMPLICIT_CALL or
          nextTag in IMPLICIT_UNSPACED_CALL and
          not tokens[i + 1]?.spaced and not tokens[i + 1]?.newLine)
        tag = token[0] = 'FUNC_EXIST' if tag is '?'
        startImplicitCall i + 1
        return forward(2)

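An implicit call that takes an implicit indented object as its first argument.

f
  a: b
  c: d

and

f
  1
  a: b
  b: c

Don't accept implicit calls of this type when on the same line as the control structures below, as that may misinterpret constructs like

if f
   a: 1

as

if f(a: 1)

which is probably always unintended. Furthermore, don't allow this in literal arrays, as that creates grammatical ambiguities.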

      if tag in IMPLICIT_FUNC and
         @indexOfTag(i + 1, 'INDENT') > -1 and @looksObjectish(i + 2) and
         not @findTagsBackwards(i, ['CLASS', 'EXTENDS', 'IF', 'CATCH',
          'SWITCH', 'LEADING_WHEN', 'FOR', 'WHILE', 'UNTIL'])
        startImplicitCall i + 1
        stack.push ['INDENT', i + 2]
        return forward(3)
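
Seen from the source side, the two implicit-call branches above mean that the spellings below should end up as the same calls. This is a small sketch, assuming a CoffeeScript version that includes this rewriter behaviour; `f` is a hypothetical helper that simply returns its arguments.

```coffeescript
# Hypothetical helper: returns its arguments as an array.
f = (args...) -> args

# Standard implicit call: `f 1, 2` is rewritten to `f(1, 2)`.
implicitCall = f 1, 2
explicitCall = f(1, 2)

# Implicit call whose first argument is an implicit indented object.
implicitObject = f
  a: 1
  b: 2
explicitObject = f({a: 1, b: 2})

console.log JSON.stringify(implicitCall)     # [1,2]
console.log JSON.stringify(implicitObject)   # [{"a":1,"b":2}]
console.log JSON.stringify(implicitCall) is JSON.stringify(explicitCall)       # true
console.log JSON.stringify(implicitObject) is JSON.stringify(explicitObject)   # true
```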

Implicit objects start here
```
      if tag is ':'
```

Go back to the (implicit) start of the object

        s = switch
          when @tag(i - 1) in EXPRESSION_END then start[1]
          when @tag(i - 2) is '@' then i - 2
          else i - 1
        s -= 2 while @tag(s - 2) is 'HERECOMMENT'

Mark whether the value is a for loop

        @insideForDeclaration = nextTag is 'FOR'

        startsLine = s is 0 or @tag(s - 1) in LINEBREAKS or tokens[s - 1].newLine

Are we just continuing an already declared object?

        if stackTop()
          [stackTag, stackIdx] = stackTop()
          if (stackTag is '{' or stackTag is 'INDENT' and @tag(stackIdx - 1) is '{') and
             (startsLine or @tag(s - 1) is ',' or @tag(s - 1) is '{')
            return forward(1)

        startImplicitObject(s, !!startsLine)
        return forward(2)

End implicit calls when chaining method calls, for example
```
f ->
  a
.g b, ->
  c
.h a
```
and
```
f a
.g b
.h a
```

Mark all enclosing objects as not sameLine

      if tag in LINEBREAKS
        for stackItem in stack by -1
          break unless isImplicit stackItem
          stackItem[2].sameLine = no if isImplicitObject stackItem

      newLine = prevTag is 'OUTDENT' or prevToken.newLine
      if tag in IMPLICIT_END or tag in CALL_CLOSERS and newLine
        while inImplicit()
          [stackTag, stackIdx, {sameLine, startsLine}] = stackTop()

Close implicit calls when the end of the argument list is reached

          if inImplicitCall() and prevTag isnt ','
            endImplicitCall()

Close implicit objects such as: return a: 1, b: 2 unless true

          else if inImplicitObject() and not @insideForDeclaration and sameLine and
                  tag isnt 'TERMINATOR' and prevTag isnt ':'
            endImplicitObject()

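Close implicit objects at the end of a line when the line didn't end with a comma and either the implicit object didn't start the line or the next line doesn't look like the continuation of an object.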

          else if inImplicitObject() and tag is 'TERMINATOR' and prevTag isnt ',' and
                  not (startsLine and @looksObjectish(i + 1))
            return forward 1 if nextTag is 'HERECOMMENT'
            endImplicitObject()
          else
            break

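Close the implicit object if a comma is the last character and what comes after doesn't look like it belongs to the object. This is used for trailing commas and calls, like

x =
    a: b,
    c: d,
e = 2

and

f a, b: c, d: e, f, g: h: i, j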

      if tag is ',' and not @looksObjectish(i + 1) and inImplicitObject() and
         not @insideForDeclaration and
         (nextTag isnt 'TERMINATOR' or not @looksObjectish(i + 2))

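When nextTag is OUTDENT, the comma is insignificant and should just be ignored, so embed it in the implicit object.

When it isn't, the comma goes on to play a role in a call or array further up the stack, so give it a chance.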
```
        offset = if nextTag is 'OUTDENT' then 1 else 0
        while inImplicitObject()
          endImplicitObject i + offset
      return forward(1)
```

Add location data to all tokens generated by the rewriter.

  addLocationDataToGeneratedTokens: ->
    @scanTokens (token, i, tokens) ->
      return 1 if     token[2]
      return 1 unless token.generated or token.explicit
      if token[0] is '{' and nextLocation=tokens[i + 1]?[2]
        {first_line: line, first_column: column} = nextLocation
      else if prevLocation = tokens[i - 1]?[2]
        {last_line: line, last_column: column} = prevLocation
      else
        line = column = 0
      token[2] =
        first_line:   line
        first_column: column
        last_line:    line
        last_column:  column
      return 1
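
The branch logic above gives every generated token a zero-width location: an opening brace borrows the start of the token after it, anything else borrows the end of the token before it. The sketch below isolates that choice in a hypothetical `locationFor` helper; the `loc` helper and the token arrays are invented for the example.

```coffeescript
# Hypothetical helper: build a zero-width location at (line, column).
loc = (line, column) ->
  {first_line: line, first_column: column, last_line: line, last_column: column}

# Hypothetical helper mirroring the branch logic above for a single token.
locationFor = (tokens, i) ->
  token = tokens[i]
  if token[0] is '{' and nextLocation = tokens[i + 1]?[2]
    {first_line: line, first_column: column} = nextLocation
  else if prevLocation = tokens[i - 1]?[2]
    {last_line: line, last_column: column} = prevLocation
  else
    line = column = 0
  loc line, column

# A generated CALL_START takes the end position of the token before it.
callTokens = [['IDENTIFIER', 'f', loc(0, 0)], ['CALL_START', '('], ['NUMBER', '1', loc(0, 2)]]
console.log JSON.stringify locationFor(callTokens, 1)
# {"first_line":0,"first_column":0,"last_line":0,"last_column":0}

# A generated '{' takes the start position of the token after it.
braceTokens = [['=', '=', loc(0, 2)], ['{', '{'], ['IDENTIFIER', 'a', loc(1, 2)]]
console.log JSON.stringify locationFor(braceTokens, 1)
# {"first_line":1,"first_column":2,"last_line":1,"last_column":2}
```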

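OUTDENT tokens should always be positioned at the last character of the previous token, so that AST nodes ending in an OUTDENT token end up with a location corresponding to the last "real" token under the node.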

  fixOutdentLocationData: ->
    @scanTokens (token, i, tokens) ->
      return 1 unless token[0] is 'OUTDENT' or
        (token.generated and token[0] is 'CALL_END') or
        (token.generated and token[0] is '}')
      prevLocationData = tokens[i - 1][2]
      token[2] =
        first_line:   prevLocationData.last_line
        first_column: prevLocationData.last_column
        last_line:    prevLocationData.last_line
        last_column:  prevLocationData.last_column
      return 1

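Because our grammar is LALR(1), it can't handle some single-line expressions that lack ending delimiters. The **Rewriter** adds the implicit blocks, so the grammar doesn't need to. To keep the grammar clean and tidy, trailing newlines within expressions are removed and the indentation tokens of empty blocks are added.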

  normalizeLines: ->
    starter = indent = outdent = null

    condition = (token, i) ->
      token[1] isnt ';' and token[0] in SINGLE_CLOSERS and
      not (token[0] is 'TERMINATOR' and @tag(i + 1) in EXPRESSION_CLOSE) and
      not (token[0] is 'ELSE' and starter isnt 'THEN') and
      not (token[0] in ['CATCH', 'FINALLY'] and starter in ['->', '=>']) or
      token[0] in CALL_CLOSERS and
      (@tokens[i - 1].newLine or @tokens[i - 1][0] is 'OUTDENT')

    action = (token, i) ->
      @tokens.splice (if @tag(i - 1) is ',' then i - 1 else i), 0, outdent

    @scanTokens (token, i, tokens) ->
      [tag] = token
      if tag is 'TERMINATOR'
        if @tag(i + 1) is 'ELSE' and @tag(i - 1) isnt 'OUTDENT'
          tokens.splice i, 1, @indentation()...
          return 1
        if @tag(i + 1) in EXPRESSION_CLOSE
          tokens.splice i, 1
          return 0
      if tag is 'CATCH'
        for j in [1..2] when @tag(i + j) in ['OUTDENT', 'TERMINATOR', 'FINALLY']
          tokens.splice i + j, 0, @indentation()...
          return 2 + j
      if tag in SINGLE_LINERS and @tag(i + 1) isnt 'INDENT' and
         not (tag is 'ELSE' and @tag(i + 1) is 'IF')
        starter = tag
        [indent, outdent] = @indentation tokens[i]
        indent.fromThen   = true if starter is 'THEN'
        tokens.splice i + 1, 0, indent
        @detectEnd i + 2, condition, action
        tokens.splice i, 1 if tag is 'THEN'
        return 1
      return 1

Tag postfix conditionals as such, so that we can parse them with a different precedence.

  tagPostfixConditionals: ->

    original = null

    condition = (token, i) ->
      [tag] = token
      [prevTag] = @tokens[i - 1]
      tag is 'TERMINATOR' or (tag is 'INDENT' and prevTag not in SINGLE_LINERS)

    action = (token, i) ->
      if token[0] isnt 'INDENT' or (token.generated and not token.fromThen)
        original[0] = 'POST_' + original[0]

    @scanTokens (token, i) ->
      return 1 unless token[0] is 'IF'
      original = token
      @detectEnd i + 1, condition, action
      return 1

Generate the indentation tokens, based on another token on the same line.

  indentation: (origin) ->
    indent  = ['INDENT', 2]
    outdent = ['OUTDENT', 2]
    if origin
      indent.generated = outdent.generated = yes
      indent.origin = outdent.origin = origin
    else
      indent.explicit = outdent.explicit = yes
    [indent, outdent]

  generate: generate
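
As a rough picture of how normalizeLines uses these token pairs, the sketch below wraps the body of a single-line `-> a` in an implicit block. The indentation function is repeated so the snippet runs alone; the token stream is hand-written, and the splice positions are chosen by hand rather than found by detectEnd.

```coffeescript
# Standalone copy of the method above.
indentation = (origin) ->
  indent  = ['INDENT', 2]
  outdent = ['OUTDENT', 2]
  if origin
    indent.generated = outdent.generated = yes
    indent.origin = outdent.origin = origin
  else
    indent.explicit = outdent.explicit = yes
  [indent, outdent]

# Hand-written stream for the single-liner `-> a`.
tokens = [['->', '->'], ['IDENTIFIER', 'a'], ['TERMINATOR', '\n']]

[indent, outdent] = indentation tokens[0]
tokens.splice 1, 0, indent     # open the implicit block right after '->'
tokens.splice 3, 0, outdent    # close it just before the TERMINATOR

console.log (t[0] for t in tokens).join ' '
# -> INDENT IDENTIFIER OUTDENT TERMINATOR
console.log indent.generated, indent.origin is tokens[0]   # true true

# Without an origin, the pair is marked explicit instead of generated.
[bareIndent] = indentation()
console.log bareIndent.explicit, bareIndent.generated?     # true false
```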

Look up a tag by token index.
```
  tag: (i) -> @tokens[i]?[0]
```
Constants

List of the token pairs that must be balanced.

BALANCED_PAIRS = [
  ['(', ')']
  ['[', ']']
  ['{', '}']
  ['INDENT', 'OUTDENT']
  ['CALL_START', 'CALL_END']
  ['PARAM_START', 'PARAM_END']
  ['INDEX_START', 'INDEX_END']
  ['STRING_START', 'STRING_END']
  ['REGEX_START', 'REGEX_END']
]

The inverse mappings of the BALANCED_PAIRS we're trying to fix up, so we can look things up from either end.
```
exports.INVERSES = INVERSES = {}
```

The tokens that signal the start/end of a balanced pair.

EXPRESSION_START = []
EXPRESSION_END   = []

for [left, rite] in BALANCED_PAIRS
  EXPRESSION_START.push INVERSES[rite] = left
  EXPRESSION_END  .push INVERSES[left] = rite
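
The loop above fills three structures at once, which is easy to miss. A minimal standalone sketch on a shortened pair list (three of the pairs from BALANCED_PAIRS) shows what ends up where.

```coffeescript
# Shortened pair list for the example.
BALANCED_PAIRS = [
  ['(', ')']
  ['[', ']']
  ['INDENT', 'OUTDENT']
]

INVERSES         = {}
EXPRESSION_START = []
EXPRESSION_END   = []

# Each assignment evaluates to its right-hand side, so a single statement both
# records the inverse mapping and pushes the tag onto the start/end list.
for [left, rite] in BALANCED_PAIRS
  EXPRESSION_START.push INVERSES[rite] = left
  EXPRESSION_END.push   INVERSES[left] = rite

console.log INVERSES['('], INVERSES['OUTDENT']   # ) INDENT
console.log EXPRESSION_START.join ' '            # ( [ INDENT
console.log EXPRESSION_END.join ' '              # ) ] OUTDENT
```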

Tokens that indicate the close of a clause of an expression.

EXPRESSION_CLOSE = ['CATCH', 'THEN', 'ELSE', 'FINALLY'].concat EXPRESSION_END

Tokens that, if followed by an IMPLICIT_CALL, indicate a function invocation.

IMPLICIT_FUNC    = ['IDENTIFIER', 'PROPERTY', 'SUPER', ')', 'CALL_END', ']', 'INDEX_END', '@', 'THIS']

Tokens that, if preceded by an IMPLICIT_FUNC, indicate a function invocation.

IMPLICIT_CALL    = [
  'IDENTIFIER', 'PROPERTY', 'NUMBER', 'INFINITY', 'NAN'
  'STRING', 'STRING_START', 'REGEX', 'REGEX_START', 'JS'
  'NEW', 'PARAM_START', 'CLASS', 'IF', 'TRY', 'SWITCH', 'THIS'
  'UNDEFINED', 'NULL', 'BOOL'
  'UNARY', 'YIELD', 'UNARY_MATH', 'SUPER', 'THROW'
  '@', '->', '=>', '[', '(', '{', '--', '++'
]

IMPLICIT_UNSPACED_CALL = ['+', '-']

Tokens that always mark the end of an implicit call for single-liners.

IMPLICIT_END     = ['POST_IF', 'FOR', 'WHILE', 'UNTIL', 'WHEN', 'BY',
  'LOOP', 'TERMINATOR']

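Single-line flavors of block expressions that have unclosed endings. The grammar can't disambiguate them, so we insert the implicit indentation.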

SINGLE_LINERS    = ['ELSE', '->', '=>', 'TRY', 'FINALLY', 'THEN']
SINGLE_CLOSERS   = ['TERMINATOR', 'CATCH', 'FINALLY', 'ELSE', 'OUTDENT', 'LEADING_WHEN']

Tokens that end a line.

LINEBREAKS       = ['TERMINATOR', 'INDENT', 'OUTDENT']

Tokens that close open calls when they follow a newline.
```
CALL_CLOSERS     = ['.', '?.', '::', '?::']
```
