Transforming logs with Vector

I'm working on a new Kubernetes-based system in which the pods all run different applications. The requirements are:

  • First, push the logs out
  • Standardize the output of each log across all the various applications

Vector to the rescue: it is a data pipeline that can perform in-place transformations.

Steps (bash/zsh)

  • Pull the Docker image: docker pull timberio/vector:0.50.0-debian
  • Create an alias: alias vector='docker run -i -v "$(pwd)":/etc/vector/ timberio/vector:0.50.0-debian'
  • Add a vector.yaml config file that reads from stdin, transforms each event (adds a field), and writes the result back to stdout

This example defines a pipeline from a source to a sink with a transform in the middle. The stdin source places each input line in the .message field. In the transform, parse_json! parses that line as JSON and assigns the result to . (the root of the event). The next line adds a new field, timestamp, by calling the VRL function now(), which returns the current time in UTC.

sources:
  stdin_input:
    type: stdin

transforms:
  remap_transform:
    type: remap
    inputs: [stdin_input]
    source: |
      . = parse_json!(.message)
      .timestamp = now()

sinks:
  output_console:
    type: console
    inputs: [remap_transform]
    encoding:
      codec: json
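
One thing to keep in mind with the transform above: parse_json! raises an error whenever a line is not valid JSON. A more defensive variant of the same transform could capture the error instead. This is only a sketch that I have not run against this exact Vector version; the parse_error field name is my own invention, not something Vector defines.

```yaml
transforms:
  remap_transform:
    type: remap
    inputs: [stdin_input]
    source: |
      # Capture the error instead of raising it with the ! variant
      parsed, err = parse_json(.message)
      if err == null {
        # Valid JSON: replace the event with the parsed value
        . = parsed
      } else {
        # Not JSON: keep the raw line in .message and record why it failed
        # (parse_error is a hypothetical field name)
        .parse_error = err
      }
      .timestamp = now()
```

The sources and sinks sections stay the same as in the config above.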
  • Test it out: echo '{"message": "Happy", "Place": "Mittens", "On": "Kittens"}' | vector --config vector.yaml

Results

INFO vector::app: Log level is enabled. level=info
INFO vector::app: Loading configs. path="/etc/vector/vector.yaml"
INFO vector::sources::stdin: Capturing stdin
{"message":"Happy","Place":"Mittens","On":"Kittens","timestamp":"2025-10-29T21:02:12.249699Z"}
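
The output still carries whatever field names the application chose (Place and On here). As a rough sketch of the standardizing requirement, the same remap block could rename fields and tag each event. The target name place and the app field with its value are hypothetical, picked only for illustration, and I am assuming exists and del behave as described in the VRL function reference.

```yaml
transforms:
  remap_transform:
    type: remap
    inputs: [stdin_input]
    source: |
      . = parse_json!(.message)
      .timestamp = now()
      # Rename an application-specific field to a common lower-case name
      # (place is a hypothetical standard name)
      if exists(.Place) {
        .place = del(.Place)
      }
      # Tag the event with its origin (hypothetical field and value)
      .app = "kittens-service"
```

Each application would likely need its own small set of renames like this, all converging on one shared set of field names.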