# Chapter 2. Lexical Structure

This chapter covers:

* Case sensitivity, spaces, and line breaks
* Comments
* Literals
* Identifiers and reserved words
* Unicode
* Optional semicolons

## 2.1 The Text of a JS Program

* JS is **case-sensitive**.
* JS ignores spaces (`\u0020`) that appear between tokens.
* For the most part, JS also ignores line breaks (see 2.6 for the exception).
* JS recognizes tabs, assorted ASCII control characters, and Unicode space characters as whitespace.
* JS recognizes newlines, carriage returns, and a carriage return/line feed sequence as line terminators.

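
The rules above can be checked with a short sketch. The variable names are made up for illustration; only the behavior matters.

```javascript
// Identifiers that differ only in case are three distinct bindings.
const online = 1;
const Online = 2;
const ONLINE = 3;

// Spaces and line breaks between tokens do not change the result.
const compact = online+Online+ONLINE;
const spreadOut = online
    +   Online
    +   ONLINE;

console.log(compact, spreadOut);   // 6 6

// \u0020 is the escape for the ordinary space character.
console.log('\u0020' === ' ');     // true
```
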
## 2.2 Comments

Two styles of comments are supported:

* `// comment content` for a single-line comment, which runs to the end of the line
* `/* comment content */` for a comment that may span multiple lines

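
A minimal sketch of both comment styles in use. The names `radius` and `area` are placeholders chosen for this example.

```javascript
// A single-line comment runs to the end of the line.
const radius = 2; // It can also follow code on the same line.

/*
 * A block comment can span several lines.
 */
const area = /* it can also sit inside an expression */ Math.PI * radius ** 2;

console.log(area.toFixed(2)); // 12.57
```
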
## 2.3 Literals

A **literal** is a data value that appears directly in a program, for example:

* `12`
* `1.2`
* `'hello'`
* `true`

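
As a quick illustration, the four literals listed above can be passed to `typeof` to see which kind of value each one produces. The array name `samples` is only a placeholder.

```javascript
// The same four literals as in the list above.
const samples = [12, 1.2, 'hello', true];

// typeof reports the type of the value each literal denotes.
const kinds = samples.map((value) => typeof value);

console.log(kinds.join(', ')); // number, number, string, boolean
```
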
## 2.4 Identifiers and Reserved Words

* An **identifier** is simply a name, used to name constants, variables, functions, etc.
* A JS identifier must start with one of:
  * `_`
  * `$`
  * a letter
* Subsequent characters may also be digits; a digit is not allowed as the first character.

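
The naming rule can be approximated with a regular expression. This is a simplified sketch, not the language's actual grammar: the helper `looksLikeIdentifier` is hypothetical, it only handles ASCII (JS also accepts Unicode letters), and it does not reject reserved words.

```javascript
// Hypothetical helper: ASCII-only approximation of the identifier rule.
// First character: letter, underscore, or dollar sign.
// Remaining characters: the same set plus digits.
const ASCII_IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function looksLikeIdentifier(name) {
  return ASCII_IDENTIFIER.test(name);
}

console.log(looksLikeIdentifier('i'));        // true
console.log(looksLikeIdentifier('_dummy'));   // true
console.log(looksLikeIdentifier('$str'));     // true
console.log(looksLikeIdentifier('v13'));      // true
console.log(looksLikeIdentifier('2fast'));    // false (starts with a digit)
console.log(looksLikeIdentifier('my-name'));  // false (hyphen is not allowed)
```
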
### 2.4.1 Reserved Words

JS reserves certain words as keywords, so they cannot be used as identifiers. For the full list, see the [Mozilla reference](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#Reserved_keywords_as_of_ECMAScript_2015).

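
One way to observe a reserved word being rejected is to ask the engine to parse a snippet without running it. The helper `parses` below is a hypothetical name for this sketch; it relies on the standard `Function` constructor, which throws a `SyntaxError` when the source does not parse.

```javascript
// Hypothetical helper: reports whether a snippet parses as a function body.
// `new Function` compiles the source but does not execute it.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error; // anything else is unexpected, so let it surface
  }
}

console.log(parses('let total = 1;'));  // true
console.log(parses('let if = 1;'));     // false, `if` is reserved
console.log(parses('let class = 1;'));  // false, `class` is reserved
```
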
## 2.5 Unicode

JS programs (files) are written using the Unicode character set, so any Unicode character can be used in strings and comments.

By convention, only ASCII letters and digits are used in identifiers, although the language itself also allows Unicode letters there.

### 2.5.1 Unicode Escape Sequence

Some computers cannot display or input the full set of Unicode characters. For backward compatibility, JS defines **escape sequences** that write Unicode characters using only ASCII characters.

An escape begins with `\u` and is followed either by exactly four hex digits or, since ES6, by one to six hex digits inside curly braces. Given the declaration in the first item, the next two items refer to the same variable:

* `let café = 1`
* `caf\u00e9`
* `caf\u{E9}`

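
The escape forms can be tried directly. This sketch assumes the source file is saved as UTF-8 with `é` stored as the single code point U+00E9.

```javascript
// Declared with the literal character é (U+00E9).
const café = 1;

// Both escape forms name the same variable.
console.log(caf\u00e9);   // 1
console.log(caf\u{E9});   // 1

// Escapes work inside strings as well.
console.log('caf\u00e9' === 'caf\u{E9}');  // true

// The curly-brace form can reach code points above U+FFFF,
// which occupy two UTF-16 code units in a JS string.
console.log('\u{1F600}'.length);           // 2
```
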
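### 2.5.2 Unicode Normalization

A non-ASCII character can sometimes be encoded in Unicode in more than one way. For example, `é` can be the single code point U+00E9 or an `e` followed by the combining acute accent U+0301. The two forms look the same on screen but are different sequences, so names written with them conflict: JS treats them as different identifiers. To avoid this, normalize the source code with an editor or another tool.
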
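
The same ambiguity can be observed with strings at run time. The sketch below uses the standard `String.prototype.normalize` method purely as an illustration; the notes above are about normalizing source files, which is done by an editor or tool rather than by this method.

```javascript
const precomposed = 'caf\u00e9';   // é as the single code point U+00E9
const decomposed = 'cafe\u0301';   // e followed by the combining accent U+0301

// They render the same but are different sequences.
console.log(precomposed === decomposed);             // false
console.log(precomposed.length, decomposed.length);  // 4 5

// After normalizing both to the same form (NFC here), they compare equal.
const same = precomposed.normalize('NFC') === decomposed.normalize('NFC');
console.log(same);                                   // true
```
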
## 2.6 Optional Semicolons

* JS uses `;` to separate statements:
  * `a = 3; b = 4`
* The semicolon can be omitted when two statements are written on separate lines.
* You can still write `;` explicitly for clear separation.
* JS treats a line break as a semicolon only if the next nonspace character cannot be interpreted as a continuation of the current statement.

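
A small sketch of the rules above, with placeholder variable names:

```javascript
// Two statements on one line: the semicolon between them is required.
let a = 3; let b = 4

// Statements on separate lines: each line break acts as a semicolon here,
// because `let` and `console` cannot continue the previous statement.
let c = a + b
let d = c * 2

console.log(a, b, c, d) // 3 4 7 14
```
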
A case where the line break is not treated as a semicolon:

```js
let y = x + f
(a+b).toString()
```

is parsed by JS as

```js
let y = x + f(a+b).toString();
```
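
To see this interpretation at run time, the snippet can be given concrete values. Everything defined here (`x`, `a`, `b`, and the function `f`) is a hypothetical stand-in chosen so that the call is visible in the result.

```javascript
const x = 1;
const a = 2;
const b = 3;
// Hypothetical function, used only to make the call observable.
const f = (n) => n * 10;

// No semicolon after `f`: the parenthesis on the next line continues
// the statement, so `f` is called with `a + b`.
let y = x + f
(a + b).toString()

// f(5) is 50, toString() gives "50", and 1 + "50" concatenates to "150".
console.log(y);        // 150
console.log(typeof y); // string
```

Writing `let y = x + f;` with an explicit semicolon would instead end the statement before the parenthesis.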