Problem/Motivation
Database identifier resolution has grown more complex recently, with escaping, caching, and the introduction of the square-bracket syntax, which lets generic SQL statements be ported across platforms that use different quoting characters. All of this comes on top of table prefixing.
All of this has been added to the Connection class, which IMHO is overloading it a bit.
Also, some platforms are hitting identifier length issues, e.g. #571548: Identifiers longer than 63 characters are truncated, causing Views to break on Postgres.
Developers should not have to worry about exceeding identifier length limits or using invalid identifiers; the database API should be self-sufficient in receiving input and cleansing/transforming it into machine-usable information, failing only as a last resort.
Proposed resolution
Decouple identifier management, including prefix management, into a class of its own that platforms can override to handle their own restrictions.
Therefore:
- Table names, database name, column names, index names in a db are 'identifiers'.
- 'Identifiers' are subject to db-specific limitations (length, allowed characters, etc.)
- In Drupal, 'identifiers' are value objects. Each identifier has:
  - a type (e.g. table, column, alias, etc.)
  - a raw value, which may include characters not allowed in SQL and/or quote characters, and may be in <database>.<schema>.<table> format (e.g. config, key_value, mydb."key_value", with$$invalidchar, etc.)
  - a canonical value, i.e. the raw value cleansed of disallowed characters (from the example above: config, key_value, mydb.key_value, withinvalidchar, etc.)
  - a machine-usable value, i.e. the canonical value transformed for use in the db-specific SQL query, including applying quote characters, prefixing, and shortening if necessary (from the example above, and assuming a prefix 'abba': "abbaconfig", "abbakey_value", "mydb"."key_value", "abbawithinvalidchar", etc.)
- Each db driver implements an 'IdentifierHandling' class that parses/transforms a raw value into the relevant value object and keeps a local cache so that each raw value is processed only once.
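
To make the raw / canonical / machine-usable split concrete, here is a rough, language-neutral sketch in Python. It is not the proposed Drupal (PHP) API: the names TableIdentifier and IdentifierHandler, the cleansing regex, the rule that only unqualified table names get the prefix, and the truncate-plus-hash shortening strategy are all assumptions made for illustration. The 63-character limit is taken from the Postgres case mentioned above.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TableIdentifier:
    """Value object for a table identifier (hypothetical shape)."""
    raw: str        # input as received, e.g. 'mydb."key_value"'
    canonical: str  # raw value cleansed, e.g. 'mydb.key_value'
    machine: str    # ready for SQL, e.g. '"mydb"."key_value"'


class IdentifierHandler:
    """Hypothetical per-driver handler; a driver would override the rules."""

    quote = '"'
    max_length = 63  # assumed limit, as in the Postgres case

    def __init__(self, prefix=""):
        self.prefix = prefix
        self._cache = {}

    def table(self, raw):
        # Local cache: each raw value is parsed only once.
        if raw not in self._cache:
            self._cache[raw] = self._parse(raw)
        return self._cache[raw]

    def _parse(self, raw):
        # Keep only letters, digits, underscores and the dot separator.
        canonical = re.sub(r"[^A-Za-z0-9_.]", "", raw)
        *qualifiers, table = canonical.split(".")
        if not table:
            # Fail as a last resort: nothing usable is left after cleansing.
            raise ValueError(f"Invalid table identifier: {raw!r}")
        # Assumption: only unqualified table names receive the prefix.
        if not qualifiers:
            table = self.prefix + table
        table = self._shorten(table)
        parts = [*qualifiers, table]
        machine = ".".join(f"{self.quote}{p}{self.quote}" for p in parts)
        return TableIdentifier(raw, canonical, machine)

    def _shorten(self, name):
        # Assumed strategy: truncate and append a short hash to stay unique.
        if len(name) <= self.max_length:
            return name
        digest = hashlib.sha256(name.encode()).hexdigest()[:8]
        return f"{name[: self.max_length - 9]}_{digest}"
```

With a handler created as IdentifierHandler(prefix="abba"), table("config").machine gives "abbaconfig" (quotes included) and table('mydb."key_value"').machine gives "mydb"."key_value", matching the examples in the list above. A second call with the same raw value returns the cached object.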
This issue focuses only on table identifiers, which also require database and schema identifiers to be fleshed out. In follow-ups, this approach could be extended from tables to columns, aliases, indexes, etc.