LENGTH returns the graphemes (which is roughly equivalent to printed characters) in a string.
||The string whose length you want.|
||The encoding you want to use for calculating the string’s length. Defaults to UTF-8.|
length returns an Int.
length operations might return
NULL values indicating errors in the following cases:
- The encoding_name provided is not available in our encoding package.
- Some byte sequence in str was not compatible with the selected encoding.
Materialize returns the length of fixed-width strings as the maximum width of the string. For example
length on a
CHAR(15) column returns
15 as each string’s length.
Materialize receives strings from your database in the same format they are emitted. In the case of fixed-width strings, e.g.
CHAR columns in PostgreSQL, we receive the value padded by empty spaces. Because we cannot determine whether those spaces were intentional or an artifact of a fixed-width string, we provide the length of the string as we received it.
You can find any updates on this behavior in this GitHub issue.
- Materialize uses the
encodingcrate. See the list of supported encodings, as well as their names within the API.
For example, you can refer to
However, there are some differences in the names of the same encodings that we do not convert. For example, the windows-874 encoding is referred to as
WIN874in PostgreSQL; Materialize does not perform a conversion for these names.
SELECT length('你好') AS len;
len ----- 2
SELECT length('你好', 'big5') AS len;
len ----- 3