ECMAScript is used to implement a variety of tools that check ECMAScript code for conformance with the specification, minimize it, perform other transformations, or generate ECMAScript code. These tools have to check identifiers for conformance, which depends on both the identifier specification and the underlying Unicode specification. This currently requires the tools to include large regular expressions or tables. A tool that brings its own data will likely support only one ECMAScript/Unicode version, and there’s no guarantee that its data matches the identifier definition of the runtime it’s running on, because implementations are free to support any Unicode version at or above the minimum version required by the ECMAScript specification.
In general, Unicode character properties can be supported through code point classification functions or through regular expression patterns. In the case of ECMAScript parsers, it seems classification functions are more useful:
|Parser (of)||Tokenizer||non-ASCII characters||# of ES/Unicode versions||ES version||Unicode version||Unicode escapes|
|CoffeeScript||RegExp||wrong - accepts 0x7F-0xFFFF||—||?||?||no|
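To make the "wrong - accepts 0x7F-0xFFFF" entry concrete, the following sketch compares a range-based check with a check based on the Unicode `ID_Start` property. Both patterns are illustrative assumptions: `rangeStart` is a hypothetical pattern of the kind the table describes, not code copied from any parser, and `propertyStart` assumes a runtime with Unicode property escapes and the ES2015-style definition of IdentifierStart, without escape sequences.

```javascript
// Hypothetical range-based check: any code unit from 0x7F to 0xFFFF is accepted.
const rangeStart = /^[$A-Za-z_\x7f-\uffff]/;

// Property-based check: uses ID_Start from the Unicode version built into the runtime.
const propertyStart = /^[$_\p{ID_Start}]/u;

// Sample characters: one letter, followed by three characters that are not letters.
const samples = [
  ["\u00E9", "LATIN SMALL LETTER E WITH ACUTE"],
  ["\u00D7", "MULTIPLICATION SIGN"],
  ["\u00A0", "NO-BREAK SPACE"],
  ["\u2028", "LINE SEPARATOR"],
];

// Record, for each sample, whether each check accepts it as an identifier start.
const comparison = samples.map(([ch, name]) => ({
  name,
  range: rangeStart.test(ch),
  property: propertyStart.test(ch),
}));

console.table(comparison);
```

The range-based check accepts all four samples, including a math symbol, a space, and a line terminator, while the property-based check accepts only the letter.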
Add the following functions, which detect identifier characters based on either the minimum Unicode version of a specified ECMAScript edition, or the Unicode version used by the implementation.
String.isIdentifierStart(cp [, edition])
String.isIdentifierPart(cp [, edition])
This function behaves in exactly the same way as String.isIdentifierStart, except that the return value is based on whether cp is matched by the IdentifierPart production.
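As a rough illustration of how such classification functions could be used, here is a minimal sketch that approximates them with Unicode property escapes. It rests on several assumptions: the functions are written as standalone functions rather than as properties of `String`, the optional second argument is not modeled (the result always reflects the Unicode version of the runtime), the ES2015-style definitions of IdentifierStart and IdentifierPart are used, and `isIdentifierName` is a hypothetical helper that ignores escape sequences and reserved words.

```javascript
// Sketch only: single-code-point patterns based on the runtime's Unicode data.
const ID_START = /^[$_\p{ID_Start}]$/u;
// IdentifierPart additionally allows ZWNJ (U+200C) and ZWJ (U+200D).
const ID_PART = /^[$\u200C\u200D\p{ID_Continue}]$/u;

// Approximates the proposed String.isIdentifierStart(cp); edition is not modeled.
// String.fromCodePoint throws a RangeError if cp is not a valid code point.
function isIdentifierStart(cp) {
  return ID_START.test(String.fromCodePoint(cp));
}

// Approximates the proposed String.isIdentifierPart(cp); edition is not modeled.
function isIdentifierPart(cp) {
  return ID_PART.test(String.fromCodePoint(cp));
}

// Hypothetical helper: walks a string code point by code point, the way a
// tokenizer would, and checks the first one against IdentifierStart and the
// rest against IdentifierPart.
function isIdentifierName(text) {
  let first = true;
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (first ? !isIdentifierStart(cp) : !isIdentifierPart(cp)) return false;
    first = false;
  }
  return !first; // the empty string is not an identifier
}

console.log(isIdentifierName("na\u00EFve"));
console.log(isIdentifierName("1abc"));
```

Iterating with `for...of` yields whole code points, so supplementary characters are classified as single units instead of as two surrogate code units.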