From 4942e000b81b4ad36baa1359646be30940bb090c Mon Sep 17 00:00:00 2001
From: JasonHomeWorkstationUbuntu <jasonzhuyq@outlook.com>
Date: Thu, 17 Dec 2020 02:50:47 +0000
Subject: [PATCH] Finished Chap2

---
 notes/chap2_lexical_structure.md | 83 ++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100644 notes/chap2_lexical_structure.md

diff --git a/notes/chap2_lexical_structure.md b/notes/chap2_lexical_structure.md
new file mode 100644
index 0000000..b53effa
--- /dev/null
+++ b/notes/chap2_lexical_structure.md
@@ -0,0 +1,83 @@
+# Chapter 2. Lexical Structure
+
+Covers:
+* Case sensitivity, spaces, and line breaks
+* Comments
+* Literals
+* Identifiers and reserved words
+* Unicode
+* Optional semicolons
+
+## 2.1 The Text of a JS Program
+
+* JS is **case-sensitive**.
+* JS ignores spaces (`\u0020`) that appear between tokens.
+* JS ignores line breaks between tokens for the most part (see 2.6 for the exception).
+* JS recognizes tabs, assorted ASCII control characters, and various Unicode space characters as whitespace.
+* JS recognizes newlines, carriage returns, and carriage return/line feed sequences as line terminators.
+
+## 2.2 Comments
+
+JS supports two styles of comments:
+* `// comment content` for single-line comments
+* `/* comment content */` for multi-line comments
+
+## 2.3 Literals
+
+A **literal** is a data value that appears directly in a program, e.g.:
+* `12`
+* `1.2`
+* `'hello'`
+* `true`
+
+## 2.4 Identifiers and Reserved Words
+
+* An **identifier** is simply a name, used to name constants, variables, functions, etc.
+* A JS identifier must start with
+  * `_`, or
+  * `$`, or
+  * a letter (digits are allowed only after the first character)
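+
+The start-character rule can be sketched as a regular expression. `isValidIdentifierName` is a hypothetical helper introduced only for illustration; it assumes an engine that supports Unicode property escapes (ES2018+) and it does not reject reserved words.
+
+```js
+// Hypothetical helper: checks the shape of an identifier name only.
+// First character: a Unicode letter (ID_Start), `$`, or `_`.
+// Later characters may also be digits and other ID_Continue characters.
+const IDENTIFIER_NAME = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200C\u200D]*$/u;
+
+function isValidIdentifierName(name) {
+  return IDENTIFIER_NAME.test(name);
+}
+
+console.log(isValidIdentifierName("_tmp")); // true
+console.log(isValidIdentifierName("$el"));  // true
+console.log(isValidIdentifierName("v13"));  // true
+console.log(isValidIdentifierName("1abc")); // false (starts with a digit)
+console.log(isValidIdentifierName("#x"));   // false (`#` cannot start an identifier)
+```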
+
+### 2.4.1 Reserved Words
+
+JS reserves certain keywords, so they cannot be used as identifiers; see the [Mozilla reference](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#Reserved_keywords_as_of_ECMAScript_2015).
+
+## 2.5 Unicode
+
+JS programs (files) are written using the Unicode character set, so any Unicode character can be used in strings and comments.
+
+By convention, only ASCII letters and digits are used in identifiers.
+
+### 2.5.1 Unicode Escape Sequence
+
+Some computers cannot display, input, or correctly process the full set of Unicode characters. For backward compatibility, JS defines **escape sequences** that let you write Unicode characters using only ASCII characters.
+
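+A Unicode escape begins with `\u` and is followed by either exactly four hex digits or, since ES6, one to six hex digits enclosed in curly braces:
+* `let café = 1` defines a variable whose name contains a non-ASCII character
+* `caf\u00e9` refers to the same variable using the four-digit form
+* `caf\u{E9}` refers to the same variable using the curly-brace form (ES6)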
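+
+A minimal runnable sketch of these escapes, assuming an ES6+ engine and a source file saved as UTF-8:
+
+```js
+// A variable whose name contains the non-ASCII character é (U+00E9).
+let café = 1;
+
+// Both escape forms name the same variable.
+console.log(caf\u00e9);  // 1
+console.log(caf\u{E9});  // 1
+
+// The same escapes work inside string literals.
+console.log("caf\u00e9" === "café");  // true
+
+// Code points above U+FFFF need the curly-brace form.
+console.log("\u{1F600}".length);  // 2 (stored as two UTF-16 code units)
+```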
+
+### 2.5.2 Unicode Normalization
+
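+Unicode allows more than one way to encode the same non-ASCII character: for example, `é` can be the single character `\u00E9` or an ASCII `e` followed by the combining acute accent `\u0301`. The two forms look identical but are encoded differently, so JS treats them as different, which can cause conflicts (e.g., two distinct identifiers that look the same). JS assumes the source code has already been normalized, so normalize it with your editor or another tool.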
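+
+The underlying problem can be illustrated with strings at runtime. This is only a sketch: `String.prototype.normalize` (ES6) normalizes string values, whereas normalizing the source text itself is a job for an editor or external tool.
+
+```js
+const composed = "caf\u00e9";     // é as the single code point U+00E9
+const decomposed = "cafe\u0301";  // e followed by combining acute accent U+0301
+
+// The two strings render the same but are not equal.
+console.log(composed === decomposed);             // false
+console.log(composed.length, decomposed.length);  // 4 5
+
+// After normalizing both to the same form (NFC here), they compare equal.
+console.log(composed.normalize("NFC") === decomposed.normalize("NFC"));  // true
+```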
+
+## 2.6 Optional Semicolons
+
+* JS uses `;` to separate statements:
+  * `a=3 ; b=4`
+* The semicolon can be omitted when two statements are written on separate lines.
+  * You can still write `;` explicitly for clear separation.
+  * In general, JS treats a line break as a semicolon only if the next non-space character cannot be interpreted as a continuation of the current statement.
+
+An example where a newline is not treated as a semicolon:
+
+```js
+let y = x + f
+(a+b).toString()
+```
+
+will be treated by JS as
+
+```js
+let y = x + f(a+b).toString();
+```
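+
+A runnable sketch of the same behavior. The values of `x`, `f`, `a`, and `b` are made up for illustration.
+
+```js
+// Hypothetical values, chosen only to make the result visible.
+const x = 1;
+const f = (n) => n * 10;
+const a = 2, b = 3;
+
+// No semicolon is inserted after `f`, because `(` can continue the statement.
+let y = x + f
+(a + b).toString()
+console.log(y);  // "150", i.e. x + f(a + b).toString()
+
+// With explicit semicolons, these are two separate statements.
+let z = x + f;
+(a + b).toString();
+console.log(typeof z);  // "string" (the function f is coerced to a string)
+```
+
+Writing the semicolon explicitly, as in the second half, avoids this surprise when a line starts with `(`.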
\ No newline at end of file