Chapter 2. Lexical Structure

This chapter covers:

  • Case sensitivity, spaces, and line breaks
  • Comments
  • Literals
  • Identifiers and reserved words
  • Unicode
  • Optional semicolons

2.1 The Text of a JS Program

  • JS is case-sensitive.
  • JS ignores spaces (\u0020) between tokens.
  • JS mostly ignores line breaks (section 2.6 covers the exception).
  • JS recognizes tabs, assorted ASCII control characters, and Unicode space characters as whitespace.
  • JS recognizes newlines, carriage returns, and carriage return/line feed sequences as line terminators.
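A small sketch of the first three points. The variable names here are made up for illustration only:

```javascript
// Case sensitivity: these are two distinct variables.
let online = 1;
let Online = 2;

// Spaces and line breaks between tokens are ignored,
// so both declarations below compute the same sum.
const compact = online+Online;
const spaced  =   online   +
    Online;

console.log(online !== Online);  // true
console.log(compact === spaced); // true
```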

2.2 Comments

JS supports two styles of comments:

  • // comment content (single-line comment, runs to the end of the line)
  • /* comment content */ (may span multiple lines)

2.3 Literals

A literal is a data value that appears directly in a program, e.g.:

  • 12
  • 1.2
  • 'hello'
  • true

2.4 Identifiers and Reserved Words

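  • An identifier is simply a name, used to name constants, variables, functions, etc.
  • A JS identifier must start with
    • a letter,
    • _ (underscore), or
    • $ (dollar sign)
  • Subsequent characters can be letters, digits, _, or $ (a digit cannot come first).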
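The naming rule can be sketched as a regular expression. This is a simplified, ASCII-only check: `isAsciiIdentifier` is a hypothetical helper written for these notes, it ignores non-ASCII letters (which JS also allows), and it does not reject reserved words.

```javascript
// Hypothetical helper: ASCII-only approximation of the identifier rule.
// First character: letter, _ or $. Remaining characters: letters, digits, _ or $.
function isAsciiIdentifier(name) {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name);
}

console.log(isAsciiIdentifier("i"));                // true
console.log(isAsciiIdentifier("my_variable_name")); // true
console.log(isAsciiIdentifier("v13"));              // true
console.log(isAsciiIdentifier("_dummy"));           // true
console.log(isAsciiIdentifier("$str"));             // true
console.log(isAsciiIdentifier("9lives"));           // false (starts with a digit)
console.log(isAsciiIdentifier("#tag"));             // false (# is not allowed)
```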

2.4.1 Reserved Words

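JS reserves a number of words (e.g. if, while, for) for use by the language itself; they generally cannot be used as identifiers. See the Mozilla (MDN) reference for the full list.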

2.5 Unicode

JS programs (files) are written using the Unicode character set, so any Unicode character can be used in strings and comments.

By convention, only ASCII letters and digits are used in identifiers, although the language also allows Unicode letters there.

2.5.1 Unicode Escape Sequence

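Some hardware and software cannot display, input, or correctly process the full set of Unicode characters. For backward compatibility, JS defines escape sequences that let you write any Unicode character using only ASCII characters.

How? Begin with \u, followed by either exactly four hex digits (the original form) or one to six hex digits inside curly braces (since ES6). All three lines below refer to the same variable:

  • let café = 1
  • caf\u00e9
  • caf\u{E9}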
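A runnable sketch of both escape forms, written with ASCII only so it survives any editor. The variable names other than the `café` example are made up for illustration:

```javascript
// Declare the variable using only ASCII: \u00e9 stands for é (U+00E9).
let caf\u00e9 = 1;

// The ES6 brace form names the same identifier.
const viaBraces = caf\u{E9};

// Both forms also work inside string literals.
const fourDigit = "caf\u00e9";
const braced = "caf\u{E9}";

// Code points above U+FFFF need the brace form (more than four hex digits).
const emoji = "\u{1F600}";

console.log(viaBraces);                         // 1
console.log(fourDigit === braced);              // true
console.log(emoji.codePointAt(0).toString(16)); // 1f600
```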

2.5.2 Unicode Normalization

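A non-ASCII character can often be encoded in more than one way: for example, é can be a single code point (\u00E9) or an e followed by a combining accent (e\u0301). The two look identical on screen but are different sequences, so two identifiers that look the same can actually be distinct. JS assumes the source code has already been normalized and does no normalization of its own, so normalize source files with an editor or other tool.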
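The problem is easiest to see with strings. The sketch below uses the standard `String.prototype.normalize` method to compare two encodings of the same visible text; note that this works on string values at runtime and does not normalize source code or identifiers.

```javascript
const precomposed = "caf\u00e9";  // é as one code point
const decomposed = "cafe\u0301";  // e followed by a combining acute accent

// Same appearance, different underlying sequences.
console.log(precomposed === decomposed);            // false
console.log(precomposed.length, decomposed.length); // 4 5

// After normalizing both to the same form (NFC here), they compare equal.
const same = precomposed.normalize("NFC") === decomposed.normalize("NFC");
console.log(same); // true
```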

2.6 Optional Semicolons

  • JS uses ; to separate statements:
    • a = 3; b = 4;
  • The semicolon can be omitted when two statements are written on separate lines.
    • You can still write ; explicitly for clear separation.
    • JS treats a line break as a semicolon only if the next non-space character cannot be interpreted as a continuation of the current statement.

A case where the line break is not treated as a semicolon, contrary to what the author probably intended:

let y = x + f
(a+b).toString()

is parsed by JS as

let y = x + f(a+b).toString();
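To watch this happen, the sketch below fills in the example with made-up values. `f`, `x`, `a`, and `b` are hypothetical placeholders chosen only so the result is easy to check.

```javascript
const x = 1;
const a = 2;
const b = 3;
const f = (n) => n * 10; // hypothetical function, for illustration only

// No semicolon after f: the next line starts with "(",
// so it is parsed as a call, f(a + b).
let y = x + f
(a + b).toString()

// With an explicit semicolon, the two lines are separate statements.
let z = x + f;
(a + b).toString();

console.log(y);       // "150", i.e. 1 + "50"
console.log(y === z); // false: z is 1 concatenated with the source text of f
```

Writing the semicolon explicitly, or never starting a line with `(`, avoids the surprise.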