CREATE SOURCE: JSON over Kinesis
CREATE SOURCE connects Materialize to some data source, and lets you interact
with its data as if it were in a SQL table.
This document details how to connect Materialize to JSON-formatted Kinesis streams.
Sources represent connections to resources outside Materialize that it can read data from. For more information, see API Components: Sources.
|MATERIALIZED||Materializes the source’s data, which retains all data in memory and makes sources directly selectable. For more information, see Materialized source details.|
|src_name||The name for the source, which is used as its table name within SQL.|
|col_name||Override default column name with the provided identifier. If used, a col_name must be provided for each column in the created source.|
|KINESIS ARN arn||The AWS ARN of the Kinesis Data Stream.|
|WITH ( option_list )||Options affecting source creation. For more detail, see
|FORMAT BYTES||Materialize receives data as raw bytes, but you can easily transform the data to JSON. For more details, see Byte format details (JSON).|
|ENVELOPE NONE||(Default) Use an append-only envelope. This means that records will only be appended and cannot be updated or deleted.|
The following options are valid within the
||A valid access key ID to the Kinesis stream.|
||A valid secret access key to the Kinesis stream.|
||The session token associated with the credentials, if the credentials are temporary|
For details about the IAM account whose details you provide, see Kinesis source details.
If you do not explicitly provide AWS credentials, Materialize will attempt to fetch credentials by reading standard environment variables, filesystem paths, and, if running on AWS ECS or EC2, the container or instance profile. Consult the Rusoto credentials documentation for details.
Credentials fetched from a container or instance profile expire on a fixed schedule. Materialize will attempt to refresh the credentials automatically before they expire, but the source will become inoperable if the refresh operation fails.
Materialized source details
Materializing a source keeps data it receives in an in-memory
index, the presence of which makes the
source directly queryable. In contrast, non-materialized sources cannot process
queries directly; to access the data the source receives, you need to create
materialized views that
SELECT from the
For a mental model, materializing the source is approximately equivalent to creating a non-materialized source, and then creating a materialized view from all of the source’s columns:
CREATE SOURCE src ...; CREATE MATERIALIZED VIEW src_view AS SELECT * FROM src;
The actual implementation of materialized sources differs, though, by letting you refer to the source’s name directly in queries.
For more details about the impact of materializing sources (and implicitly
creating an index), see
CREATE INDEX: Details — Memory
Kinesis source details
- Materialize expects each source to use to one Kinesis stream, which is—in turn—generated by a single table in an upstream database.
- By default, Materialize will try to read credentials automatically via Rusoto’s ChainProvider. If credentials are explicitly provided, those will be used instead.
- The IAM account whose credentials you provide requires
kinesis-readpermissions and access to
- Kinesis sources will only have one column, which will be named
Byte format details (JSON)
Here is an example of using a Kinesis source, converting the streams bytes into
CREATE MATERIALIZED VIEW jsonified_kinesis_source AS SELECT CAST(data AS JSONB) AS data FROM ( SELECT CONVERT_FROM(data, 'utf8') AS data FROM kinesis_source )
Append-only envelope means that all records received by the source is treated as an insert. This is Materialize’s default envelope (i.e. if no envelope is specified), and can be specified with ENVELOPE NONE.
CREATE SOURCE kinesis_source FROM KINESIS ARN ... WITH ( access_key_id = ..., secret_access_key = ... ) FORMAT BYTES;
This creates a source that…
- Is append-only.
- Has one column,
data, which represents the stream’s incoming bytes.
To use this data in views, you can decode its bytes into
jsonb. For example:
CREATE MATERIALIZED VIEW jsonified_kinesis_source AS SELECT CAST(data AS jsonb) AS data FROM ( SELECT convert_from(data, 'utf8') AS data FROM kinesis_source )