Sink

Sinks define where the extracted metadata is delivered. A recipe must specify at least one sink and may specify several, which saves you from creating duplicate recipes for the same job. The example below shows the correct usage for the http and kafka sinks.

Writing the sinks part of your recipe

sinks: # required - at least 1 sink defined
  - name: http
    config:
      method: POST
      url: "https://example.com/metadata"
  - name: kafka
    config:
      brokers: localhost:9092
      topic: "target-topic"
      key_path:
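| Key    | Description                                          | Requirement               |
|--------|------------------------------------------------------|---------------------------|
| name   | contains the name of the sink                        | required                  |
| config | different sinks will require different configuration | optional, depends on sink |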
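
The rules above can be checked mechanically. The following is a minimal Python sketch, not part of Meteor: `validate_sinks` is a hypothetical helper that takes a recipe already parsed into a dictionary and reports violations of the stated rules (at least one sink, `name` required, `config` optional).

```python
from typing import Any


def validate_sinks(recipe: dict[str, Any]) -> list[str]:
    """Return a list of problems found in the recipe's `sinks` section.

    Hypothetical helper for illustration only. It checks the general rules
    (at least one sink, `name` required, `config` optional) and knows
    nothing about sink-specific config keys.
    """
    sinks = recipe.get("sinks")
    if not isinstance(sinks, list) or not sinks:
        return ["sinks: at least one sink must be defined"]

    problems = []
    for index, sink in enumerate(sinks):
        if not isinstance(sink, dict):
            problems.append(f"sinks[{index}]: expected a mapping")
            continue
        name = sink.get("name")
        if not isinstance(name, str) or not name:
            problems.append(f"sinks[{index}]: 'name' is required")
        # 'config' is optional; when present it should be a mapping.
        config = sink.get("config")
        if config is not None and not isinstance(config, dict):
            problems.append(f"sinks[{index}]: 'config' must be a mapping")
    return problems


recipe = {
    "sinks": [
        {"name": "http", "config": {"method": "POST", "url": "https://example.com/metadata"}},
        {"name": "console"},  # no config needed
    ]
}
print(validate_sinks(recipe))  # an empty list means no problems were found
```

A check like this only covers the general shape of the sinks section; whether a given sink needs a `config`, and which keys it accepts, depends on the sink.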

Available Sinks

  • Console
name: sample-recipe
sinks:
  - name: console

Prints metadata to stdout.

  • File
sinks:
  - name: file
    config:
      path: "./dir/sample.yaml"
      format: "yaml"

Writes metadata to a file in JSON or YAML format, as defined in the config.

  • Google Cloud Storage
sinks:
  - name: gcs
    config:
      project_id: google-project-id
      url: gcs://bucket_name/target_folder
      object_prefix: github-users
      service_account_base64: <base64 encoded service account key>
      service_account_json:
        {
          "type": "service_account",
          "private_key_id": "xxxxxxx",
          "private_key": "xxxxxxx",
          "client_email": "xxxxxxx",
          "client_id": "xxxxxxx",
          "auth_uri": "https://accounts.google.com/o/oauth2/auth",
          "token_uri": "https://oauth2.googleapis.com/token",
          "auth_provider_x509_cert_url": "xxxxxxx",
          "client_x509_cert_url": "xxxxxxx"
        }

Writes JSON data to a file in ndjson (newline-delimited JSON) format in a Google Cloud Storage bucket.
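
For readers unfamiliar with ndjson: each record is serialized as one JSON object on its own line. The Python sketch below illustrates that file layout only. `to_ndjson` and the sample records are hypothetical, and nothing here reflects how the gcs sink is actually implemented.

```python
import json
from typing import Any, Iterable


def to_ndjson(records: Iterable[dict[str, Any]]) -> str:
    """Serialize records as ndjson: one compact JSON object per line."""
    return "".join(
        json.dumps(record, separators=(",", ":")) + "\n" for record in records
    )


# Hypothetical metadata records, for illustration only.
records = [
    {"name": "user-a", "type": "user"},
    {"name": "user-b", "type": "user"},
]
print(to_ndjson(records), end="")
```

Because every line is a complete JSON document, a consumer can read such a file line by line with `json.loads` instead of loading it whole.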

  • HTTP
sinks:
  - name: http
    config:
      method: POST
      success_code: 200
      url: https://compass.com/v1beta1/asset
      headers:
        Header-1: value11,value12

Sends metadata to an HTTP destination, as defined in the config.
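
To make the config keys concrete, the sketch below builds the kind of request such a config describes, using only Python's standard library. It is an illustration under assumptions, not Meteor's implementation: the JSON body, the `Content-Type` header, the default of 200 for `success_code`, and the helper names `build_request` and `is_success` are all hypothetical. The request is only constructed, never sent.

```python
import json
import urllib.request

# Mirrors the keys of the http sink config shown in this section.
config = {
    "method": "POST",
    "success_code": 200,
    "url": "https://compass.com/v1beta1/asset",
    "headers": {"Header-1": "value11,value12"},
}


def build_request(config: dict, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP request from a sink-style config."""
    # Assumption: the payload is sent as a JSON body.
    headers = {"Content-Type": "application/json", **config.get("headers", {})}
    return urllib.request.Request(
        url=config["url"],
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method=config["method"],
    )


def is_success(status: int, config: dict) -> bool:
    """Treat a response as successful only if it matches success_code."""
    # Assumption: fall back to 200 when success_code is not configured.
    return status == config.get("success_code", 200)


request = build_request(config, {"name": "sample-asset"})
print(request.get_method(), request.full_url)
```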

  • Stencil
sinks:
  - name: stencil
    config:
      host: https://stencil.com
      namespace_id: myNamespace
      schema_id: mySchema
      format: json
      send_format_header: false

Uploads metadata in the given schema format to an existing namespace_id in Stencil. The request is sent via HTTP to the given host.

Upcoming sinks

  • HTTP
  • Kafka

Serializer

By default, metadata is serialized to JSON before sinking. To send it in another format, define a serializer in the sink config.

Custom Sink

Meteor has built-in sinks, such as Kafka and HTTP, that you can use directly. Custom sinks will also be supported, so that a shared sink configuration does not have to be repeated (DRY).

This is useful if you find yourself sinking multiple metadata sources to one place.

Sample Custom Sink

  • central_metadata_store_sink.yaml
name: central-metadata-store # unique sink name as an ID
sink:
  - name: http
    config:
      method: PUT
      url: "https://metadata-store.com/metadata"

More info about available sinks can be found here.