CPP Info Memo part1(2)

tubo posted @ 2014-09-04 17:32 in Uncategorized, 763 views

1 Overview

The preprocessor is greedy when it splits the input into tokens: it always makes each token, starting from the left, as big as possible before moving on to the next token. For instance,

a+++++b

is interpreted as

a ++ ++ + b

and not as

a ++ + ++ b

even though the latter tokenization could be part of a valid C program and the former could not.

In short: scan from left to right, and take as many characters as possible for each token.
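The greedy rule can be observed in ordinary C code. The following is a minimal sketch, not taken from the original notes; the function names greedy_plus and spaced_plus are hypothetical. Three plus signs in a row are split as ++ followed by +, so a+++b means (a++) + b. Writing the spaces by hand is the only way to get the a ++ + ++ b reading.

```c
/* a+++b is tokenized as  a ++ + b,  i.e. (a++) + b.
 * Returns the sum and stores the final value of a in *a_after. */
int greedy_plus(int a, int b, int *a_after)
{
    int sum = a+++b;      /* uses the old value of a, then increments a */
    *a_after = a;
    return sum;
}

/* With explicit spaces the tokens are  a ++ + ++ b,  i.e. (a++) + (++b).
 * The unspaced form a+++++b would be tokenized as  a ++ ++ + b  and rejected. */
int spaced_plus(int a, int b, int *a_after, int *b_after)
{
    int sum = a ++ + ++ b;
    *a_after = a;
    *b_after = b;
    return sum;
}
```

With a = 1 and b = 10, greedy_plus returns 11 and leaves a at 2, which shows that the ++ was attached to a and not to b.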

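Once the input file has been broken into tokens, the token boundaries never change, except when the "##" operator is used to paste two tokens together.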

For example, given the definition:

#define foo() bar

the input:

foo()baz

is preprocessed into the two tokens bar baz, not the single token barbaz.
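One way to see that bar and baz stay separate tokens is to make bar a type name, so that "bar baz" becomes a valid declaration. This is only an illustrative sketch: the typedef, the PASTE helper, and the function names are assumptions added here, not part of the original notes.

```c
typedef int bar;             /* hypothetical: lets "bar baz" declare a variable */

#define foo() bar
#define PASTE(a, b) a##b     /* hypothetical helper that pastes two tokens */

int separate_tokens(void)
{
    foo()baz = 42;           /* expands to:  bar baz = 42;  (two tokens) */
    return baz;
}

int pasted_token(void)
{
    int barbaz = 7;
    return PASTE(bar, baz);  /* ## forms the single identifier barbaz */
}
```

If the expansion of foo()baz were the single token barbaz, separate_tokens would fail to compile because no variable named barbaz is declared in it. Only the explicit "##" in PASTE produces that identifier.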

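The tokens output by the preprocessor serve only as input to the compiler, which does not re-tokenize them; they are never fed back into the preprocessor as input.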

Preprocessing tokens fall into five classes: identifiers, preprocessing numbers, string literals, punctuators, and other.

Identifiers

Defined the same way as identifiers in C: any sequence of letters, digits, or underscores, which begins with a letter or underscore.
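The rule above can be written down as a small checker. This is a sketch under stated assumptions: the function name is_identifier is hypothetical, "letters" is taken to mean what isalpha accepts in the default "C" locale, and extensions such as '$' or extended characters in identifiers are deliberately ignored.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s is a non-empty sequence of letters, digits, or underscores
 * that begins with a letter or underscore, and 0 otherwise. */
int is_identifier(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;

    /* The first character must be a letter or an underscore. */
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return 0;

    /* Every later character may also be a digit. */
    for (s++; *s != '\0'; s++) {
        if (!(isalnum((unsigned char)*s) || *s == '_'))
            return 0;
    }
    return 1;
}
```

Note that a C keyword such as while passes this check, because the rule is purely lexical.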

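To the preprocessor, C keywords are no different from ordinary identifiers. The one identifier it treats specially is defined, which can be considered a preprocessing keyword.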
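Both points can be checked with conditional directives. In the sketch below, FEATURE_X and the helper names are hypothetical. The first test uses defined as the preprocessor's own operator. The second relies on the C rule that, in an #if expression, any identifier that is not a macro is replaced by 0, even when it is spelled like a keyword (some compilers warn about this when undefined-identifier warnings are enabled).

```c
#define FEATURE_X                 /* hypothetical macro used for the demo */

/* "defined" is understood by the preprocessor itself. */
#if defined(FEATURE_X)
#define HAS_FEATURE_X 1
#else
#define HAS_FEATURE_X 0
#endif

/* "while" is a keyword to the compiler, but here it is just an identifier
 * that is not a macro, so it evaluates to 0 inside #if. */
#if !defined(while) && (while == 0)
#define KEYWORD_IS_ORDINARY 1
#else
#define KEYWORD_IS_ORDINARY 0
#endif

int has_feature_x(void)       { return HAS_FEATURE_X; }
int keyword_is_ordinary(void) { return KEYWORD_IS_ORDINARY; }
```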