From 4942e000b81b4ad36baa1359646be30940bb090c Mon Sep 17 00:00:00 2001
From: JasonHomeWorkstationUbuntu
Date: Thu, 17 Dec 2020 02:50:47 +0000
Subject: [PATCH] Finished Chap2

---
 notes/chap2_lexical_structure.md | 83 ++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100644 notes/chap2_lexical_structure.md

diff --git a/notes/chap2_lexical_structure.md b/notes/chap2_lexical_structure.md
new file mode 100644
index 0000000..b53effa
--- /dev/null
+++ b/notes/chap2_lexical_structure.md
@@ -0,0 +1,83 @@
+# Chapter 2. Lexical Structure
+
+Topics covered:
+* Case sensitivity, spaces, and line breaks
+* Comments
+* Literals
+* Identifiers and reserved words
+* Unicode
+* Optional semicolons
+
+## 2.1 The Text of a JS Program
+
+* JS is **case-sensitive**.
+* JS ignores space characters (`\u0020`) between tokens.
+* JS mostly ignores line breaks (see 2.6 for the exception).
+* JS recognizes tabs, assorted ASCII control characters, and various Unicode space characters as whitespace.
+* JS recognizes newlines, carriage returns, and carriage return/line feed sequences as line terminators.
+
+## 2.2 Comments
+
+Two types of comments are supported:
+* `// comment content` for single-line comments
+* `/* comment content */` for multi-line comments
+
+## 2.3 Literals
+
+A **literal** is a data value that appears directly in a program, e.g.:
+* `12`
+* `1.2`
+* `'hello'`
+* `true`
+
+## 2.4 Identifiers and Reserved Words
+
+* An **identifier** is simply a name, used to name constants, variables, functions, etc.
+* A JS identifier must start with
+    * `_` or
+    * `$` or
+    * a letter (digits are allowed only after the first character)
+
+### 2.4.1 Reserved Words
+
+JS reserves some keywords; see the [Mozilla reference](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#Reserved_keywords_as_of_ECMAScript_2015).
+
+## 2.5 Unicode
+
+JS programs (files) are written using the Unicode character set, so any Unicode character can be used in strings and comments.
+
+By convention, only ASCII letters and digits are used in identifiers.
+
+### 2.5.1 Unicode Escape Sequences
+
+Some hardware and software cannot display or input the full set of Unicode characters. For compatibility with such systems, JS defines **escape sequences** that write Unicode characters using only ASCII characters.
+
+How? Begin with `\u`, followed by exactly four hex digits, or (since ES6) by one to six hex digits in curly braces:
+* `let café = 1`
+* `caf\u00e9`
+* `caf\u{E9}`
+
+### 2.5.2 Unicode Normalization
+
+The same non-ASCII character can be encoded in more than one form (for example, `é` as one code point or as `e` plus a combining accent), so two identifiers that look identical can be different. JS does not normalize source text itself, so normalize it with an editor or another tool.
+
+## 2.6 Optional Semicolons
+
+* JS uses `;` to separate statements:
+    * `a=3 ; b=4`
+* The semicolon can usually be omitted when two statements are written on separate lines.
+    * You can still write `;` explicitly for clear separation.
+    * JS treats a line break as a semicolon only if the next nonspace character cannot be interpreted as a continuation of the current statement.
+
+A newline that is wrongly treated as a continuation:
+
+```js
+let y = x + f
+(a+b).toString()
+```
+
+is treated by JS as
+
+```js
+let y = x + f(a+b).toString();
+```
\ No newline at end of file
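
The rules noted in sections 2.5.1, 2.5.2, and 2.6 can be checked with a short script. This is only a sketch and not part of the patch: the function `f` and the values of `x`, `a`, and `b` are made up for illustration, and the file is assumed to be saved as UTF-8.

```javascript
// 2.5.1: an escape sequence inside an identifier names the same variable.
let café = 1;
const viaFourDigits = caf\u00e9;  // \u followed by exactly four hex digits
const viaBraces = caf\u{E9};      // \u{...} with one to six hex digits (ES6+)

// 2.5.2: the same visible text can be encoded in two different ways.
const precomposed = "caf\u00e9";  // é as a single code point
const combining = "cafe\u0301";   // e followed by a combining acute accent
const sameBeforeNormalizing = precomposed === combining;
const sameAfterNormalizing =
  precomposed.normalize("NFC") === combining.normalize("NFC");

// 2.6: a line that starts with "(" continues the previous statement,
// so no semicolon is inserted after `f`.
const f = (n) => n * 10;          // hypothetical function
const x = 1, a = 2, b = 3;        // hypothetical values
let y = x + f
(a + b).toString();
// y was computed as x + f(a + b).toString()

console.log(viaFourDigits, viaBraces);
console.log(sameBeforeNormalizing, sameAfterNormalizing);
console.log(y);
```

Ending the statement explicitly (`let y = x + f;`) would keep the two lines separate, which is one reason to write semicolons even where they are optional.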