Update module github.com/Jeffail/benthos/v3 to v4 (!4)

autocafe requested to merge renovate/github.com-jeffail-benthos-v3-4.x into master Mar 26, 2024

This MR contains the following updates:

Package	Type	Update	Change
github.com/Jeffail/benthos/v3	require	major	`v3.65.0` -> `v4.38.0`
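
Under Go's semantic import versioning, a module at major version 2 or higher carries a matching `/vN` suffix in its module path, so a major bump like the one in the table above normally involves changing import paths as well as the version. The sketch below checks whether a module path and a version agree on the major version. The helper names are hypothetical, and the check deliberately ignores special cases such as `gopkg.in` paths and `+incompatible` versions.

```go
package main

import (
	"fmt"
	"strings"
)

// versionMajor returns the major component of a version such as "v4.38.0".
func versionMajor(version string) string {
	major, _, _ := strings.Cut(version, ".")
	return major
}

// pathSuffixMajor returns the trailing "/vN" element of a module path
// (without the slash), or "" when the path has no such suffix.
func pathSuffixMajor(modulePath string) string {
	last := modulePath[strings.LastIndex(modulePath, "/")+1:]
	if len(last) > 1 && last[0] == 'v' && strings.Trim(last[1:], "0123456789") == "" {
		return last
	}
	return ""
}

// compatible reports whether the version's major number matches the path.
// Paths without a suffix are only valid for v0 and v1 releases.
func compatible(modulePath, version string) bool {
	vm := versionMajor(version)
	pm := pathSuffixMajor(modulePath)
	if pm == "" {
		return vm == "v0" || vm == "v1"
	}
	return pm == vm
}

func main() {
	fmt.Println(compatible("github.com/Jeffail/benthos/v3", "v3.65.0"))
	fmt.Println(compatible("github.com/Jeffail/benthos/v3", "v4.38.0"))
}
```

If the second check fails, as it does here, the `require` line and every import of the module would likely need a path ending in `/v4` before the build resolves.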

Release Notes

Jeffail/benthos (github.com/Jeffail/benthos/v3)

`v4.38.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Anonymous telemetry data is now sent by Connect instances after running for >5 mins. Details about which data is sent, when it is sent, and how to disable it can be found in the telemetry README. (@Jeffail)
Field checksum_algorithm added to the aws_s3 output. (@dom-lee-naimuri)
Field nkey added to nats, nats_jetstream, nats_kv and nats_stream components. (@ye11ow)
Field private_key added to the snowflake_put output. (@mihaitodor)
New azure_data_lake_gen2 output. (@ooesili)
New timeplus output. (@ye11ow)

Fixed

The elasticsearch output now performs retries for HTTP status code 429 (Too Many Requests). (@kahoowkh)
The docs for the collection field of the mongodb output now specify support for interpolation functions. (@mihaitodor)

Changed

All components with a default path field value (such as the aws_s3 output) containing the deprecated function count have now been changed to use the new function counter. This could potentially change behaviour in cases where multiple components are executing a mapping with a count function sharing the same name as the old default count, and these counters need to cascade. This is an extremely unlikely scenario, but for all users of these components it is recommended that your path is defined explicitly, and in a future major version we will be removing the defaults.

The full change log can be found here.

`v4.37.0`

Compare Source

For installation instructions check out the getting started guide.

Added

New experimental gcp_vertex_ai_embeddings processor. (@rockwotj)
New experimental aws_bedrock_embeddings processor. (@rockwotj)
New experimental cohere_chat and cohere_embeddings processors. (@rockwotj)
New experimental questdb output. (@sklarsa)
Field metadata_max_age added to the kafka_franz input. (@Scarjit)
Field metadata_max_age added to the kafka_migrator input. (@mihaitodor)
New experimental cypher output. (@rockwotj)
New experimental couchbase output. (@rockwotj)
Field fetch_in_order added to the schema_registry input. (@mihaitodor)

Fixed

Fixed a bug with the input_resource field for the kafka_migrator output where new topics weren't created as expected. (@mihaitodor)
Fixed a bug in the kafka_migrator input which could lead to extra duplicate messages during a consumer group rebalance. (@mihaitodor)
kafka_migrator, kafka_migrator_offsets and kafka_migrator_bundle components renamed to redpanda_migrator, redpanda_migrator_offsets and redpanda_migrator_bundle (@mihaitodor)

Fixed

Fixes a panic in the parquet_encode processor (@mihaitodor)

The full change log can be found here.

`v4.36.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Fields replication_factor and replication_factor_override added to the kafka_migrator input and output. (@mihaitodor)

Fixed

The schema_registry_encode and schema_registry_decode processors no longer unescape path separators in the schema name. (@Mizaro)
(Benthos) The switch output metrics now emit the case id as part of their labels. This fixes a regression introduced in v4.25.0. (@mihaitodor)
(Benthos) Fixed a bug where certain logs used the %w verb to print errors resulting in incorrect output. (@mihaitodor)
(Benthos) The logger no longer tries to replace Go fmt verbs in log messages. (@mihaitodor)

The full change log can be found here.

`v4.35.3`

Compare Source

For installation instructions check out the getting started guide.

Added

Azure and GCP components added to cloud builds. (@Jeffail)

Fixed

The kafka_migrator_bundle input and output no longer require schema registry to be configured. (@mihaitodor)

The full change log can be found here.

`v4.35.2`

Compare Source

For installation instructions check out the getting started guide.

Added

Azure and GCP components added to cloud builds. (@Jeffail)

Fixed

The kafka_migrator_bundle input and output no longer require schema registry to be configured. (@mihaitodor)

The full change log can be found here.

`v4.35.1`

Compare Source

For installation instructions check out the getting started guide.

Added

Azure and GCP components added to cloud builds. (@Jeffail)

Fixed

The kafka_migrator_bundle input and output no longer require schema registry to be configured. (@mihaitodor)

The full change log can be found here.

`v4.35.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Auth fields added to the schema_registry input and output. (@mihaitodor)
New experimental kafka_migrator and kafka_migrator_bundle inputs and outputs. (@mihaitodor)
New experimental kafka_migrator_offsets output. (@mihaitodor)
Field job_project added to the gcp_bigquery output. (@Roviluca)

The full change log can be found here.

`v4.34.0`

Compare Source

For installation instructions check out the getting started guide.

Fixed

The schema_registry output now allows pushing schemas if the target Schema Registry instance is in IMPORT mode. (@mihaitodor)
Fixed an issue where the azure_blob_storage input would fail to delete blobs when using targets_input with delete_objects: true. (@mihaitodor)
New experimental gcp_vertex_ai_chat processor. (@rockwotj)
New experimental aws_bedrock_chat processor. (@rockwotj)

The full change log can be found here.

`v4.33.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field content_md5 added to the aws_s3 output. (@dom-lee-naimuri)
Field send_ack added to the nats input. (@plejd-sebman)
New Bloblang method vector. (@rockwotj)
New experimental ockam_kafka input and output. (@mrinalwadhwa, @davide-baldo)
Field credentials_json added to all GCP components. (@tomasz-sadura)
(Benthos) The list subcommand now supports the format jsonschema. (@Jeffail)
New experimental schema_registry input and output. (@mihaitodor)
New experimental qdrant output. (@Anush008)
(Benthos) The --set run flag now supports structured values, e.g. --set input={}. (@Jeffail)

The full change log can be found here.

`v4.32.1`

Compare Source

For installation instructions check out the getting started guide.

Added

Field app_name added to the MongoDB components. (@mihaitodor)
New openai_chat_completion processor. (@rockwotj)
New openai_embeddings processor. (@rockwotj)
New openai_image_generation processor. (@rockwotj)
New openai_speech processor. (@rockwotj)
New openai_transcription processor. (@rockwotj)
New openai_translation processor. (@rockwotj)
New ollama_chat processor. (@rockwotj)
New ollama_embeddings processor. (@rockwotj)

Changed

The gcp_pubsub output now rejects messages with metadata values which contain invalid UTF-8-encoded runes. (@AndreasBergmeier6176)
The .goreleaser.yml configuration has been set back to version 1. (@Jeffail)
The number of release build artifacts for the community and cloud flavours has been reduced due to GitHub Action Runner disk space limitations.

The full change log can be found here.

`v4.32.0`

Compare Source

`v4.31.0`

Compare Source

For installation instructions check out the getting started guide.

4.31.0 - 2024-07-19

Added

The splunk input and splunk_hec output now support custom tls configuration. (@mihaitodor)
Field timestamp added to the kafka and kafka_franz outputs. (@mihaitodor)
(Benthos) Field max_retries added to the retry processor. (@mihaitodor)
(Benthos) Metadata fields retry_count and backoff_duration added to the retry processor. (@mihaitodor)
(Benthos) Parameter escape_html added to the format_json() Bloblang method. (@mihaitodor)
(Benthos) New array bloblang method. (@gramian)
(Benthos) Algorithm fnv32 added to the hash bloblang method. (@CallMeMhz)
New experimental redpanda_data_transform. (@rockwotj)
New -community suffixed build included in release artifacts, containing only FOSS functionality. (@Jeffail)
New -cloud suffixed build included in release artifacts, containing components enabled in Redpanda Cloud. (@Jeffail)
Field status_topic added to the global redpanda config block. (@Jeffail)
New pinecone output. (@rockwotj)
(Benthos) The /ready endpoint in regular operation now provides a detailed summary of all inputs and outputs, including connection errors where applicable. (@Jeffail)
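
For readers unfamiliar with the fnv32 algorithm mentioned above, here is a minimal Go sketch of a 32-bit FNV hash using the standard library. It assumes the FNV-1 variant (`fnv.New32`); the notes above do not say whether the Bloblang `hash` method uses FNV-1 or FNV-1a, so treat the choice as an assumption (`fnv.New32a` is the alternative). The function name `fnv32` is illustrative only.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnv32 returns the 32-bit FNV-1 hash of data.
func fnv32(data []byte) uint32 {
	h := fnv.New32()
	h.Write(data) // writes to a hash.Hash never return an error
	return h.Sum32()
}

func main() {
	fmt.Printf("%08x\n", fnv32([]byte("hello world")))
}
```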

Changed

(Benthos) All cli subcommands that previously relied on root-level flags (streams, lint, test, echo) now explicitly define those flags such that they appear in help-text and can be specified after the subcommand itself. This means previous commands such as connect -r ./foo.yaml streams ./bar.yaml can now be more intuitively written as connect streams -r ./foo.yaml ./bar.yaml and so on. The old style will still work in order to preserve backwards compatibility, but the help-text for these root-level flags has been hidden. (@Jeffail)

`v4.30.1`

Compare Source

For installation instructions check out the getting started guide.

Added

(Benthos) Field omit_empty added to the lines scanner. (@mihaitodor)
(Benthos) New scheme gcm added to the encrypt_aes and decrypt_aes Bloblang methods. (@abergmeier)
(Benthos) New Bloblang method pow. (@mfamador)
(Benthos) New sin, cos, tan and pi bloblang methods. (@mfamador)
(Benthos) Field proxy_url added to the websocket input and output. (@mihaitodor)
New experimental splunk input. (@mihaitodor)
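
As background for the gcm scheme listed above, the following Go sketch shows an AES-GCM round trip with the standard library. It is not the Bloblang implementation: the notes do not describe how the nonce/IV is supplied to the Bloblang methods, so prepending a random nonce to the ciphertext is purely a choice made for this example, and the `seal` and `open` names are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16, 24 or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and returns nonce || ciphertext || tag.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal and fails if the data was modified.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed data shorter than nonce")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	key := []byte("0123456789abcdef") // 16 bytes selects AES-128
	sealed, err := seal(key, []byte("hello"))
	if err != nil {
		panic(err)
	}
	plain, err := open(key, sealed)
	fmt.Println(string(plain), err)
}
```

Unlike unauthenticated modes, GCM rejects tampered ciphertext at decryption time rather than returning corrupted plaintext.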

Fixed

The sql_insert and sql_raw components no longer fail when inserting large binary blobs into Oracle BLOB columns. (@mihaitodor)
(Benthos) The websocket input and output now obey the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables. (@mihaitodor)
AWS Lambda serverless build artifacts have been added back to official releases.

Changed

The splunk_hec output is now implemented as a native Go component. (@mihaitodor)

The full change log can be found here.

`v4.30.0`

Compare Source

For installation instructions check out the getting started guide.

Added

(Benthos) Field omit_empty added to the lines scanner. (@mihaitodor)
(Benthos) New scheme gcm added to the encrypt_aes and decrypt_aes Bloblang methods. (@abergmeier)
(Benthos) New Bloblang method pow. (@mfamador)
(Benthos) New sin, cos, tan and pi bloblang methods. (@mfamador)
(Benthos) Field proxy_url added to the websocket input and output. (@mihaitodor)
New experimental splunk input. (@mihaitodor)

Fixed

The sql_insert and sql_raw components no longer fail when inserting large binary blobs into Oracle BLOB columns. (@mihaitodor)
(Benthos) The websocket input and output now obey the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables. (@mihaitodor)

Changed

The splunk_hec output is now implemented as a native Go component. (@mihaitodor)

The full change log can be found here.

`v4.29.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Go API: New packages public/bundle/free and public/bundle/enterprise with explicit licensing for bundles of component imports.
Field auth.oauth2.scope added to the pulsar input and output. (@srenatus)
Field subscription_initial_position added to the pulsar input. (@srenatus)

Fixed

The pulsar input and output should no longer ignore auth.oauth2 fields. (@srenatus)
Creating builds using make no longer prints warnings when the repository does not contain a tag. (@mkysel)
Messages resulting from the redis processor are no longer invalid when using hash commands. (@mkysel)
The nats_jetstream input no longer fails to initialise when a stream is specified and a subject is not. (@maxarndt)

The full change log can be found here.

`v4.28.0`

Compare Source

For installation instructions check out the getting started guide.

Changed

The repository has been moved to redpanda-data/connect and no longer contains the core Benthos engine, which is now broken out into redpanda-data/benthos.

The full change log can be found here.

`v4.27.0`

Compare Source

For installation instructions check out the getting started guide.

Added

New nats_kv cache type.
The nats_jetstream input now supports last_per_subject and new deliver fallbacks.
Field error_patterns added to the drop_on output.
New redis_scan input type.
Field auto_replay_nacks added to all inputs that traditionally automatically retry nacked messages as a toggle for this behaviour.
New retry processor.
New noop cache.
Field targets_input added to the azure_blob_storage input.
New reject_errored output.
New nats_request_reply processor.
New json_documents scanner.

Fixed

The unarchive processor no longer yields linting errors when the format csv:x is specified. This was a regression introduced in v4.25.0.
The sftp input will no longer consume files when the watcher cache returns an error. Instead, it will reattempt the file upon the next poll.
The aws_sqs input no longer logs error level logs for visibility timeout refreshing errors.
The nats_kv processor now allows nats wildcards for the keys operation.
The nats_kv processor keys operation now returns a single message with an array of found keys instead of a batch of messages.
The nats_kv processor history operation now returns a single message with an array of objects containing the record fields instead of a batch of messages.
Field timeout added to the nats_kv processor to specify the maximum period to wait on an operation before aborting and returning an error.
Bloblang comparison operators (>, <, <=, >=) now match the precision of the compared integers when applicable.
The parse_form_url_encoded Bloblang method no longer produces results with an unknown data type for repeated query parameters.
The echo CLI command no longer fails to sanitise configs when encountering an empty password field.
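
To illustrate why the precision of integer comparisons matters (the comparison operator fix above), the Go sketch below shows one common way precision is lost: comparing large integers after converting them to float64. The notes do not say how Bloblang compared integers before the fix, so this is a general illustration rather than a description of the old behaviour.

```go
package main

import "fmt"

// lessAsFloat compares two integers after converting both to float64,
// which cannot represent every integer above 2^53 exactly.
func lessAsFloat(a, b int64) bool {
	return float64(a) < float64(b)
}

func main() {
	a := int64(1) << 53 // 9007199254740992
	b := a + 1          // 9007199254740993, not representable as float64

	fmt.Println(a < b)             // compared as integers
	fmt.Println(lessAsFloat(a, b)) // compared as floats
}
```

The two comparisons disagree because `b` rounds to the same float64 value as `a`.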

Changed

The log events from all inputs and outputs when they first connect have been made more consistent and no longer contain any information regarding the nature of their connections.
Splitting message batches with a split processor (or custom plugins) no longer results in downstream error handling loops around nacks. This was previously implemented as a feature to ensure unbounded expanded and split batches don't flood downstream services in the event of a minority of errors. However, introducing more clever origin tracking of errored messages has eliminated the need for this undocumented behaviour.

The full change log can be found here.

`v4.26.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field credit added to the amqp_1 input to specify the maximum number of unacknowledged messages the sender can transmit.
Bloblang now supports root-level if statements.
New experimental sql cache.
Fields batch_size, sort and limit added to the mongodb input.
Field idempotent_write added to the kafka output.

Changed

The default value of the amqp_1.credit input has changed from 1 to 64.
The mongodb processor and output now support extended JSON in canonical form for document, filter and hint mappings.
The open_telemetry_collector tracer has had the url field of gRPC and HTTP collectors deprecated in favour of address, which more accurately describes the intended format of endpoints. The old style will continue to work, but eventually will have its default value removed and an explicit value will be required.

Fixed

Resource config imports containing % characters were being incorrectly parsed during unit test execution. This was a regression introduced in v4.25.0.
Dynamic input and output config updates containing % characters were being incorrectly parsed. This was a regression introduced in v4.25.0.

The full change log can be found here.

`v4.25.1`

Compare Source

For installation instructions check out the getting started guide.

Fixed

Fixed a regression in v4.25.0 where template based components were not parsing correctly from configs.

The full change log can be found here.

`v4.25.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field address_cache added to the socket_server input.
Field read_header added to the amqp_1 input.
All inputs with a codec field now support a new field scanner to replace it. Scanners are more powerful as they are configured in a structured way similar to other component types rather than via a single string field. For more information check out the scanners page.
New diff and patch Bloblang methods.
New processors processor.
A debug endpoint /debug/pprof/allocs has been added for profiling allocations.
New cockroachdb_changefeed input.
The open_telemetry_collector tracer now supports sampling.
The aws_kinesis input and output now support specifying ARNs as the stream target.
New azure_cosmosdb input, processor and output.
All sql_* components now support the gocosmos driver.
New opensearch output.

Fixed

The javascript processor now handles module imports correctly.
Bloblang if statements now provide explicit errors when query expressions resolve to non-boolean values.
Some metadata fields from the amqp_1 input were always empty due to a type mismatch. This should no longer be the case.
The zip Bloblang method no longer fails when executed without arguments.
The amqp_0_9 output no longer prints a bogus exchange name when connecting to the server.
The generate input no longer adds an extra second to interval: '@every x' syntax.
The nats_jetstream input no longer fails to locate mirrored streams.
Fixed a rare panic in batching mechanisms with a specified period, where data arrives in low volumes and is sporadic.
Executing config unit tests should no longer fail due to output resources failing to connect.

Changed

The parse_parquet Bloblang function, parquet_decode, parquet_encode processors and the parquet input have all been upgraded to the latest version of the underlying Parquet library. Since this underlying library is experimental it is likely that behaviour changes will result. One significant change is that encoding numerical values that are larger than the column type (float64 into FLOAT, int64 into INT32, etc) will no longer be automatically converted.
The parse_log processor field codec is now deprecated.
WARNING: Many components have had their underlying implementations moved onto newer internal APIs for defining and extracting their configuration fields. It's recommended that upgrades to this version are performed cautiously.
WARNING: All AWS components have been upgraded to the latest client libraries. Although lots of testing has been done, these libraries have the potential to differ in discrete ways in terms of how credentials are evaluated, cross-account connections are performed, and so on. It's recommended that upgrades to this version are performed cautiously.

The full change log can be found here.

`v4.24.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field idempotent_write added to the kafka_franz output.
Field idle_timeout added to the read_until input.
Field delay_seconds added to the aws_sqs output.
Fields discard_unknown and use_proto_names added to the protobuf processors.

Fixed

Bloblang error messages for bad function/method names or parameters should now be improved in mappings that use shorthand for root = ....
All redis components now support usernames within the configured URL for authentication.
The protobuf processor now supports targeting nested types from proto files.
The schema_registry_encode and schema_registry_decode processors should no longer double escape URL unsafe characters within subjects when querying their latest versions.

The full change log can be found here.

`v4.23.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The amqp_0_9 output now supports dynamic interpolation functions within the exchange field.
Field custom_topic_creation added to the kafka output.
New Bloblang method ts_sub.
The Bloblang method abs now supports integers in and integers out.
Experimental extract_tracing_map field added to the nats, nats_jetstream and nats_stream inputs.
Experimental inject_tracing_map field added to the nats, nats_jetstream and nats_stream outputs.
New _fail_fast variants for the broker output fan_out and fan_out_sequential patterns.
Field summary_quantiles_objectives added to the prometheus metrics exporter.
The metric processor now supports floating point values for counter_by and gauge types.

Fixed

Allow labels on caches and rate limit resources when writing configs in CUE.
Go API: log/slog loggers injected into a stream builder via StreamBuilder.SetLogger should now respect formatting strings.
All Azure components now support container SAS tokens for authentication.
The kafka_franz input now provides properly typed metadata values.
The trino driver for the various sql_* components no longer panics when trying to insert nulls.
The http_client input no longer sends a phantom request body on subsequent requests when an empty payload is specified.
The schema_registry_encode and schema_registry_decode processors should no longer fail to obtain schemas containing slashes (or other URL path unfriendly characters).
The parse_log processor no longer extracts structured fields that are incompatible with Bloblang mappings.
Fixed occurrences where Bloblang would fail to recognise float32 values.

The full change log can be found here.

`v4.22.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The -e/--env-file cli flag for importing environment variable files now supports glob patterns.
Environment variables imported via -e/--env-file cli flags now support triple quoted strings.
New experimental counter function added to Bloblang. It is recommended that this function, although experimental, should be used instead of the now deprecated count function.
The schema_registry_encode and schema_registry_decode processors now support JSONSchema.
Field metadata added to the nats and nats_jetstream outputs.
The cached processor field ttl now supports interpolation functions.
Many new properties fields have been added to the amqp_0_9 output.
Field command added to the redis_list input and output.

Fixed

Corrected a scheduling error where the generate input with a descriptor interval (@hourly, etc) had a chance of firing twice.
Fixed an issue where a redis_streams input that is rejected from read attempts enters a reconnect loop without backoff.
The sqs input now periodically refreshes the visibility timeout of messages that take a significant amount of time to process.
The ts_add_iso8601 and ts_sub_iso8601 bloblang methods now return the correct error for certain invalid durations.
The discord output no longer ignores structured message fields containing underscores.
Fixed an issue where the kafka_franz input was ignoring batching periods and stalling.

Changed

The random_int Bloblang function now prevents instantiations where either the max or min arguments are dynamic. This is in order to avoid situations where the random number generator is re-initialised across subsequent mappings in a way that surprises map authors.

The full change log can be found here.

`v4.21.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Fields client_id and rack_id added to the kafka_franz input and output.
New experimental command processor.
Parameter no_cache added to the file and env Bloblang functions.
New file_rel function added to Bloblang.
Field endpoint_params added to the oauth2 section of HTTP client components.

Fixed

Allow comments in single root and directly imported bloblang mappings.
The azure_blob_storage input no longer adds blob_storage_content_type and blob_storage_content_encoding metadata values as string pointer types, and instead adds these values as string types only when they are present.
The http_server input now returns a more appropriate 503 service unavailable status code during shutdown instead of the previous 404 status.
Fixed a potential panic when closing a pusher output that was never initialised.
The sftp output now reconnects upon being disconnected by the Azure idle timeout.
The switch output now produces error logs when messages do not pass at least one case with strict_mode enabled. Previously these rejected messages were potentially re-processed in a loop without any logs, depending on the config. An inaccuracy in the documentation has also been fixed in order to clarify behaviour when strict mode is not enabled.
The log processor fields_mapping field should no longer reject metadata queries using @ syntax.
Fixed an issue where heavily utilised streams with nested resource based outputs could lock-up when performing heavy resource mutating traffic on the streams mode REST API.
The Bloblang zip method no longer produces values that yield an "Unknown data type".

The full change log can be found here.

`v4.20.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The amqp_1 input now supports anonymous SASL authentication.
New JWT Bloblang methods parse_jwt_es256, parse_jwt_es384, parse_jwt_es512, parse_jwt_rs256, parse_jwt_rs384, parse_jwt_rs512, sign_jwt_es256, sign_jwt_es384 and sign_jwt_es512 added.
The csv-safe input codec now supports custom delimiters with the syntax csv-safe:x.
The open_telemetry_collector tracer now supports secure connections, enabled via the secure field.
Function v0_msg_exists_meta added to the javascript processor.

Fixed

Fixed an issue where saturated output resources could panic under intense CRUD activity.
The config linter no longer raises issues with codec fields containing colons within their arguments.
The elasticsearch output should no longer fail to send basic authentication passwords. This fixes a regression introduced in v4.19.0.

The full change log can be found here.

`v4.19.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field topics_pattern added to the pulsar input.
Both the schema_registry_encode and schema_registry_decode processors now support protobuf schemas.
Both the schema_registry_encode and schema_registry_decode processors now support references for AVRO and PROTOBUF schemas.
New Bloblang method zip.
New Bloblang int8, int16, uint8, uint16, float32 and float64 methods.

Fixed

Errors encountered by the gcp_pubsub output should now present more specific logs.
Upgraded kafka input and output underlying sarama client library to v1.40.0 at new module path github.com/IBM/sarama
The CUE schema for switch processor now correctly reflects that it takes a list of clauses.
Fixed the CUE schema for fields that take a 2d-array such as workflow.order.
The snowflake_put output has been added back to 32-bit ARM builds since the build incompatibilities have been resolved.
The snowflake_put output and the sql_* components no longer trigger a panic when running on a readonly file system with the snowflake driver. This driver still requires access to write temporary files somewhere, which can be configured via the Go TMPDIR environment variable. Details here.
The http_server input and output now follow the same multiplexer rules regardless of whether the general http server block is used or a custom endpoint.
Config linting should now respect fields sourced via a merge key (<<).
The lint subcommand should now lint config files pointed to via -r/--resources flags.

Changed

The snowflake_put output is now beta.
Endpoints specified by http_server components using either the general http server block or their own custom server addresses should no longer be treated as path prefixes unless the path ends with a slash (/), in which case all extensions of the path will match. This corrects a behavioural change introduced in v4.14.0.
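
The trailing-slash rule described above is the same convention followed by Go's standard library `http.ServeMux`, which makes it easy to demonstrate. The sketch below is an analogy only: the notes do not say which router the http_server components use, and the `/exact` and `/tree/` paths and the `matched` helper are made up for the example.

```go
package main

import (
	"fmt"
	"net/http"
)

// newMux registers one exact path and one subtree (trailing slash) path.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	noop := func(w http.ResponseWriter, r *http.Request) {}
	mux.HandleFunc("/exact", noop) // matches only "/exact"
	mux.HandleFunc("/tree/", noop) // matches every path under "/tree/"
	return mux
}

// matched returns the registered pattern that would serve path,
// or "" when no pattern matches.
func matched(mux *http.ServeMux, path string) string {
	req, err := http.NewRequest(http.MethodGet, path, nil)
	if err != nil {
		return ""
	}
	_, pattern := mux.Handler(req)
	return pattern
}

func main() {
	mux := newMux()
	for _, p := range []string{"/exact", "/exact/more", "/tree/a/b"} {
		fmt.Printf("%-12s -> %q\n", p, matched(mux, p))
	}
}
```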

The full change log can be found here.

`v4.18.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field logger.level_name added for customising the name of log levels in the JSON format.
Methods sign_jwt_rs256, sign_jwt_rs384 and sign_jwt_rs512 added to Bloblang.

Fixed

HTTP components no longer ignore proxy_url settings when OAuth2 is set.
The PATCH verb for the streams mode REST API no longer fails to patch over newer components implemented with the latest plugin APIs.
The nats_jetstream input no longer fails for configs that set bind to true and do not specify both a stream and durable together.
The mongodb processor and output no longer ignore the upsert field.

Changed

The old parquet processor (now superseded by parquet_encode and parquet_decode) has been removed from 32-bit ARM builds due to build incompatibilities.
The snowflake_put output has been removed from 32-bit ARM builds due to build incompatibilities.
Plugin API: The (*BatchError).WalkMessages method has been deprecated in favour of WalkMessagesIndexedBy.

The full change log can be found here.

`v4.17.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The dynamic input and output have new endpoints, /input/{id}/uptime and /output/{id}/uptime respectively, for obtaining the uptime of a given input/output.
Field wait_time_seconds added to the aws_sqs input.
Field timeout added to the gcp_cloud_storage output.
All NATS components now set the name of each connection to the component label when specified.

Fixed

Restore message ordering support to gcp_pubsub output. This issue was introduced in 4.16.0 as a result of #1836.
Specifying structured metadata values (non-strings) in unit test definitions should no longer cause linting errors.

Changed

The nats input default value of prefetch_count has been increased from 32 to a more appropriate 524288.

The full change log can be found here.

`v4.16.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Fields auth.user_jwt and auth.user_nkey_seed added to all NATS components.
bloblang: added ulid(encoding, random_source) function to generate Universally Unique Lexicographically Sortable Identifiers (ULIDs).
Field skip_on added to the cached processor.
Field nak_delay added to the nats input.
New splunk_hec output.
Plugin API: New NewMetadataExcludeFilterField function and accompanying FieldMetadataExcludeFilter method added.
The pulsar input and output are now included in the main distribution of Benthos again.
The gcp_pubsub input now adds the metadata field gcp_pubsub_delivery_attempt to messages when dead lettering is enabled.
The aws_s3 input now adds s3_version_id metadata to versioned messages.
All compress/decompress components (codecs, bloblang methods, processors) now support pgzip.
Field connection.max_retries added to the websocket input.
New sentry_capture processor.

Fixed

The open_telemetry_collector tracer option no longer blocks service start up when the endpoints cannot be reached, and instead manages connections in the background.
The gcp_pubsub output should see significant performance improvements due to a client library upgrade.
The stream builder APIs should now follow logger.file config fields.
The experimental cue format in the cli list subcommand no longer introduces infinite recursion for #Processors.
Config unit tests no longer execute linting rules for missing env var interpolations.

The full change log can be found here.

`v4.15.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Flag --skip-env-var-check added to the lint subcommand. This disables the new linting behaviour where environment variable interpolations without defaults throw linting errors when the variable is not defined.
The kafka_franz input now supports explicit partitions in the field topics.
The kafka_franz input now supports batching.
New metadata Bloblang function for batch-aware structured metadata queries.
Go API: Running the Benthos CLI with a context set with a deadline now triggers graceful termination before the deadline is reached.
Go API: New public/service/servicetest package added for functions useful for testing custom Benthos builds.
New lru and ttlru in-memory caches.
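
As background for the lru cache mentioned above, here is a minimal sketch of least-recently-used eviction in plain Go. It only illustrates the general technique; the type and method names are hypothetical, the sketch is not safe for concurrent use, and it does not reflect how the Benthos cache is implemented or configured.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct{ key, value string }

// lruCache evicts the least recently used key once capacity is exceeded.
type lruCache struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newLRU(capacity int) *lruCache {
	return &lruCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the value for key and marks it as recently used.
func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Set stores a value, evicting the oldest entry if the cache is full.
func (c *lruCache) Set(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := newLRU(2)
	c.Set("a", "1")
	c.Set("b", "2")
	c.Get("a")      // "a" is now the most recently used key
	c.Set("c", "3") // evicts "b"
	_, ok := c.Get("b")
	fmt.Println("b still cached:", ok)
}
```

A TTL-aware variant would additionally store an expiry time per entry and treat expired entries as misses.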

Fixed

Provide msgpack plugins through public/components/msgpack.
The kafka_franz input should no longer commit offsets one behind the next during partition yielding.
The streams mode HTTP API should no longer route requests to /streams/<stream-ID> to the /streams handler. This issue was introduced in v4.14.0.

The full change log can be found here.

`v4.14.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The -e/--env-file cli flag can now be specified multiple times.
New studio pull cli subcommand for running Benthos Studio session deployments.
Metadata field kafka_tombstone_message added to the kafka and kafka_franz inputs.
Method SetEnvVarLookupFunc added to the stream builder API.
The discord input and output now use the official chat client API and no longer rely on poll-based HTTP requests. This should result in more efficient and less erroneous behaviour.
New bloblang timestamp methods ts_add_iso8601 and ts_sub_iso8601.
All SQL components now support the trino driver.
New input codec csv-safe.
Added base64rawurl scheme to both the encode and decode Bloblang methods.
New find_by and find_all_by Bloblang methods.
New skipbom input codec.
New javascript processor.
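
The base64rawurl scheme name suggests the unpadded, URL-safe base64 variant that Go exposes as `base64.RawURLEncoding`; that correspondence is an assumption on our part rather than something stated in the notes. A small sketch of how that variant differs from standard base64 (the helper names are hypothetical):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeRawURL uses the URL-safe alphabet ('-' and '_') without '=' padding.
func encodeRawURL(b []byte) string {
	return base64.RawURLEncoding.EncodeToString(b)
}

// decodeRawURL reverses encodeRawURL.
func decodeRawURL(s string) ([]byte, error) {
	return base64.RawURLEncoding.DecodeString(s)
}

func main() {
	data := []byte{0xfb, 0xff}
	fmt.Println(base64.StdEncoding.EncodeToString(data)) // prints "+/8="
	fmt.Println(encodeRawURL(data))                      // prints "-_8"
}
```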

Fixed

The find_all bloblang method no longer produces results that are of an unknown type.
The find_all and find Bloblang methods no longer fail when the value argument is a field reference.
Endpoints specified by HTTP server components using both the general http server block or their own custom server addresses should now be treated as path prefixes. This corrects a behavioural change that was introduced when both respective server options were updated to support path parameters.
Prevented a panic caused when using the encrypt_aes and decrypt_aes Bloblang methods with mismatched key/iv lengths.
The snowpipe field of the snowflake_put output can now be omitted from the config without raising an error.
Batch-aware processors such as mapping and mutation should now report correct error metrics.
Running benthos blobl server should no longer panic when a mapping with variable read/writes is executed in parallel.
Speculative fix for the cloudwatch metrics exporter rejecting metrics due to minimum field size of 1, PutMetricDataInput.MetricData[0].Dimensions[0].Value.
The snowflake_put output now prevents silent failures under certain conditions. Details here.
Reduced the amount of pre-compilation of Bloblang based linting rules for documentation fields, this should dramatically improve the start up time of Benthos (~1s down to ~200ms).
Environment variable interpolations with an empty fallback (${FOO:}) are now valid.
Fixed an issue where the mongodb output wasn't using bulk send requests according to batching policies.
The amqp_1 input now falls back to accessing Message.Value when the data is empty.

Changed

When a config contains environment variable interpolations without a default value (i.e. ${FOO}), if that environment variable is not defined a linting error will be emitted. Shutting down due to linting errors can be disabled with the --chilled cli flag, and variables can be specified with an empty default value (${FOO:}) in order to make the previous behaviour explicit and prevent the new linting error.
The find and find_all Bloblang methods no longer support query arguments as they were incompatible with supporting value arguments. For query based arguments use the new find_by and find_all_by methods.
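
The environment variable rule in the first change above can be sketched as follows. This is an illustration in plain Go under stated assumptions, not the actual Benthos linter: the regular expression is a simplified guess at the interpolation syntax, and `lintEnvVars` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// interpolation matches ${NAME} and ${NAME:default}. Simplified for
// illustration; the real syntax may accept more than this.
var interpolation = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)(:([^}]*))?\}`)

// lintEnvVars returns the variables that are referenced without a default
// and are not defined according to lookup.
func lintEnvVars(config string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, m := range interpolation.FindAllStringSubmatch(config, -1) {
		name, hasDefault := m[1], m[2] != ""
		if hasDefault {
			continue // ${FOO:} and ${FOO:bar} both carry an explicit default
		}
		if _, ok := lookup(name); !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	config := "address: ${BROKER_ADDR}\ntopic: ${TOPIC:}\n"
	for _, name := range lintEnvVars(config, os.LookupEnv) {
		fmt.Printf("lint: environment variable %s is undefined and has no default\n", name)
	}
}
```

In this sketch `${TOPIC:}` never produces a lint error, mirroring the explicit empty default described in the notes.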

The full change log can be found here.

`v4.13.0`

Compare Source

For installation instructions check out the getting started guide.

Added

New nats_kv processor, input and output.
Field partition added to the kafka_franz output, allowing for manual partitioning.

Fixed

The broker output with the pattern fan_out_sequential will no longer abandon in-flight requests that are error blocked until the full shutdown timeout has occurred.
The broker input no longer reports itself as unavailable when a child input has intentionally closed.
Config unit tests that check for structured data should no longer fail in all cases.
The http_server input with a custom address now supports path variables.

The full change log can be found here.

`v4.12.1`

Compare Source

For installation instructions check out the getting started guide.

Fixed

Fixed a regression bug in the nats components where panics occur during a flood of messages. This issue was introduced in v4.12.0 (45f785a).

The full change log can be found here.

`v4.12.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Format csv:x added to the unarchive processor.
Field max_buffer added to the aws_s3 input.
Field open_message_type added to the websocket input.
The experimental --watcher cli flag now takes into account file deletions and new files that match wildcard patterns.
Field dump_request_log_level added to HTTP components.
New couchbase cache implementation.
New compress and decompress Bloblang methods.
Field endpoint added to the gcp_pubsub input and output.
Fields file_name, file_extension and request_id added to the snowflake_put output.
Add interpolation support to the path field of the snowflake_put output.
Add ZSTD compression support to the compression field of the snowflake_put output.
New Bloblang method concat.
New redis ratelimit.
The socket_server input now supports tls as a network type.
New bloblang function timestamp_unix_milli.
New bloblang method ts_unix_milli.
JWT based HTTP authentication now supports EdDSA.
New flow_control fields added to the gcp_pubsub output.
Added bloblang methods sign_jwt_hs256, sign_jwt_hs384 and sign_jwt_hs512.
New bloblang methods parse_jwt_hs256, parse_jwt_hs384, parse_jwt_hs512.
The open_telemetry_collector tracer now automatically sets the service.name and service.version tags if they are not configured by the user.
New bloblang string methods trim_prefix and trim_suffix.
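
To show what HS256 signing and parsing of a JWT involves, here is a minimal stdlib-only sketch. It is not the code behind the Bloblang methods; `signHS256` and `parseHS256` are hypothetical names, and the parser deliberately skips checks (such as validating the header's alg value or expiry claims) that a production library would perform.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// mac returns the base64url-encoded HMAC-SHA256 of input.
func mac(input string, secret []byte) string {
	h := hmac.New(sha256.New, secret)
	h.Write([]byte(input))
	return base64.RawURLEncoding.EncodeToString(h.Sum(nil))
}

// signHS256 builds header.payload.signature for the given claims.
func signHS256(claims map[string]any, secret []byte) (string, error) {
	header := base64.RawURLEncoding.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	body, err := json.Marshal(claims)
	if err != nil {
		return "", err
	}
	signingInput := header + "." + base64.RawURLEncoding.EncodeToString(body)
	return signingInput + "." + mac(signingInput, secret), nil
}

// parseHS256 verifies the signature and returns the claims.
func parseHS256(token string, secret []byte) (map[string]any, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("malformed token")
	}
	want := mac(parts[0]+"."+parts[1], secret)
	if !hmac.Equal([]byte(want), []byte(parts[2])) {
		return nil, errors.New("signature mismatch")
	}
	body, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, err
	}
	var claims map[string]any
	if err := json.Unmarshal(body, &claims); err != nil {
		return nil, err
	}
	return claims, nil
}

func main() {
	token, err := signHS256(map[string]any{"sub": "user123"}, []byte("secret"))
	if err != nil {
		panic(err)
	}
	claims, err := parseHS256(token, []byte("secret"))
	fmt.Println(claims, err)
}
```

The HS384 and HS512 variants differ only in the hash function (SHA-384 and SHA-512) and the alg header value.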

Fixed

Fixed an issue where messages caught in a retry loop from inputs that do not support nacks (generate, kafka, file, etc) could be retried in their post-mutation form from the switch output rather than the original copy of the message.
The sqlite buffer should no longer print Failed to ack buffer message logs during graceful termination.
The default value of the conn_max_idle field has been changed from 0 to 2 for all sql_* components in accordance with the database/sql docs.
The parse_csv bloblang method with parse_header_row set to false no longer produces rows that are of an unknown type.
Fixed a bug where the oracle driver for the sql_* components was returning timestamps which were getting marshalled into an empty JSON object instead of a string.
The aws_sqs input no longer backs off on subsequent empty requests when long polling is enabled.
It's now possible to mock resources within the main test target file in config unit tests.
Unit test linting no longer incorrectly expects the json_contains predicate to contain a string value only.
Config component initialisation errors should no longer show nested path annotations.
Prevented panics from the jq processor when querying invalid types.
The jaeger tracer no longer emits the service.version tag automatically if the user sets the service.name tag explicitly.
The int64(), int32(), uint64() and uint32() bloblang methods can now infer the number base as documented here.
The mapping and mutation processors should provide metrics and tracing events again.
Fixed a data race in the redis_streams input.
Upgraded the Redis components to github.com/redis/go-redis/v9.
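
The note above about int64() and friends inferring the number base refers to documentation that is not reproduced here. One plausible reading, and it is only an assumption, is that the behaviour resembles Go's `strconv.ParseInt` with base 0, where the prefix of the string selects the base. A short sketch of that rule (`parseInt64` is a hypothetical helper):

```go
package main

import (
	"fmt"
	"strconv"
)

// parseInt64 lets the prefix pick the base: 0x for hex, 0b for binary,
// 0o (or a leading 0) for octal, and decimal otherwise.
func parseInt64(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	for _, s := range []string{"42", "0xff", "0b101", "0o17"} {
		n, err := parseInt64(s)
		fmt.Println(s, "->", n, err)
	}
}
```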

The full change log can be found here.

`v4.11.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field default_encoding added to the parquet_encode processor.
Field client_session_keep_alive added to the snowflake_put output.
Bloblang now supports metadata access via @foo syntax, which also supports arbitrary values.
TLS client certs now support both PKCS#1 and PKCS#8 encrypted keys.
New redis_script processor.
New wasm processor.
Fields marked as secrets will no longer be printed with benthos echo or debug HTTP endpoints.
Add no_indent parameter to the format_json bloblang method.
New format_xml bloblang method.
New batched higher level input type.
The gcp_pubsub input now supports optionally creating subscriptions.
New sqlite buffer.
Bloblang now has int64, int32, uint64 and uint32 methods for casting explicit integer types.
Field application_properties_map added to the amqp1 output.
Params parse_header_row, delimiter and lazy_quotes added to the parse_csv bloblang method.
Field delete_on_finish added to the csv input.
Metadata fields header, path, mod_time_unix and mod_time added to the csv input.
New couchbase processor.
Field max_attempts added to the nsq input.
Messages consumed by the nsq input are now enriched with metadata.
New Bloblang method parse_url.
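
The parse_csv parameters listed above have close analogues in Go's `encoding/csv` reader (`Comma` and `LazyQuotes`). The sketch below shows what a custom delimiter, lazy quoting and header-row handling mean in practice; it assumes the Bloblang parameters behave similarly, which the notes do not confirm, and `parseCSVWithHeader` is a hypothetical helper.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseCSVWithHeader treats the first record as column names and returns
// one map per remaining row.
func parseCSVWithHeader(input string, delimiter rune, lazyQuotes bool) ([]map[string]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.Comma = delimiter
	r.LazyQuotes = lazyQuotes // tolerate stray quotes inside unquoted fields
	records, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	header := records[0]
	rows := make([]map[string]string, 0, len(records)-1)
	for _, rec := range records[1:] {
		row := make(map[string]string, len(header))
		for i, name := range header {
			if i < len(rec) {
				row[name] = rec[i]
			}
		}
		rows = append(rows, row)
	}
	return rows, nil
}

func main() {
	input := "id;note\n1;say \"hi\" there\n"
	_, strictErr := parseCSVWithHeader(input, ';', false)
	fmt.Println("strict parse error:", strictErr)
	rows, err := parseCSVWithHeader(input, ';', true)
	fmt.Println(rows, err)
}
```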

Fixed

Fixed a regression bug in the mongodb processor where message errors were not set any more. This issue was introduced in v4.7.0 (64eb72).
The avro-ocf:marshaler=json input codec now omits unexpected logical type fields.
Fixed a bug in the sql_insert output (see commit c6a71e9) where transaction-based drivers (clickhouse and oracle) would fail to roll back an in-progress transaction if any of the messages caused an error.
The resource input should no longer block the first layer of graceful termination.

Changed

The catch method now defines the context of argument mappings to be the string of the caught error. In previous cases the context was undocumented, vague and would often bind to the outer context. It's still possible to reference this outer context by capturing the error (e.g. .catch(_ -> this)).
Field interpolations that fail due to mapping errors will no longer produce placeholder values and will instead provide proper errors that result in nacks or retries similar to other issues.

The full change log can be found here.

`v4.10.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The nats_jetstream input now adds a range of useful metadata information to messages.
Field transaction_type added to the azure_table_storage output, which deprecates the previous insert_type field and supports interpolation functions.
Field logged_batch added to the cassandra output.
All sql components now support Snowflake.
New azure_table_storage input.
New sql_raw input.
New tracing_id bloblang function.
New with bloblang method.
Field multi_header added to the kafka and kafka_franz inputs.
New cassandra input.
New base64_encode and base64_decode functions for the awk processor.
Param use_number added to the parse_json bloblang method.
Fields init_statement and init_files added to all sql components.
New find and find_all bloblang array methods.
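
The use_number parameter on parse_json shares its name with `json.Decoder.UseNumber` in Go's standard library, and it seems likely (though this is an assumption) that it serves the same purpose: keeping numbers as their original digits instead of converting them to float64. A minimal sketch of why that matters for large integers, using hypothetical helpers `parseJSON` and `idAsString`:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// parseJSON decodes input, optionally preserving numbers as json.Number.
func parseJSON(input string, useNumber bool) (any, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	if useNumber {
		dec.UseNumber()
	}
	var v any
	if err := dec.Decode(&v); err != nil {
		return nil, err
	}
	return v, nil
}

// idAsString renders the "id" field of a decoded object as decimal digits.
func idAsString(input string, useNumber bool) string {
	v, err := parseJSON(input, useNumber)
	if err != nil {
		return ""
	}
	obj, ok := v.(map[string]any)
	if !ok {
		return ""
	}
	switch n := obj["id"].(type) {
	case json.Number:
		return n.String()
	case float64:
		return strconv.FormatFloat(n, 'f', -1, 64)
	}
	return ""
}

func main() {
	doc := `{"id": 9007199254740993}` // 2^53 + 1, not representable as float64
	fmt.Println("float64:    ", idAsString(doc, false))
	fmt.Println("json.Number:", idAsString(doc, true))
}
```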

Fixed

The gcp_cloud_storage output no longer ignores errors when closing a written file, this was masking issues when the target bucket was invalid.
Upgraded the kafka_franz input and output to use github.com/twmb/franz-go@v1.9.0 since some bug fixes were made recently.
Fixed an issue where a read_until child input with processors affiliated would block graceful termination.
The --labels linting option no longer flags resource components.

The full change log can be found here.

`v4.9.1`

Compare Source

For installation instructions check out the getting started guide.

Added

Go API: A new BatchError type added for distinguishing errors of a given batch.

Fixed

Rolled back kafka input and output underlying sarama client library to fix a regression introduced in 4.9.0 😅 where invalid configuration (Consumer.Group.Rebalance.GroupStrategies and Consumer.Group.Rebalance.Strategy cannot be set at the same time) errors would prevent consumption under certain configurations. We've decided to roll back rather than upgrade as a breaking API change was introduced that could cause issues for Go API importers (more info here: https://github.com/Shopify/sarama/issues/2358).

The full change log can be found here.

`v4.9.0`

Compare Source

For installation instructions check out the getting started guide.

Added

New parquet input for reading a batch of Parquet files from disk.
Field max_in_flight added to the redis_list input.

Fixed

Upgraded kafka input and output underlying sarama client library to fix a regression introduced in 4.7.0 where The requested offset is outside the range of offsets maintained by the server for the given topic/partition errors would prevent consumption of partitions.
The cassandra output now inserts logged batches of data rather than the less efficient (and unnecessary) unlogged form.

The full change log can be found here.

`v4.8.0`

Compare Source

For installation instructions check out the getting started guide.

Added

All sql components now support Oracle DB.

Fixed

All SQL components now accept an empty or unspecified args_mapping as an alias for no arguments.
Field unsafe_dynamic_query added to the sql_raw output.
Fixed a regression in 4.7.0 where HTTP client components were sending duplicate request headers.

The full change log can be found here.

`v4.7.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field avro_raw_json added to the schema_registry_decode processor.
Field priority added to the gcp_bigquery_select input.
The hash bloblang method now supports crc32.
New tracing_span bloblang function.
All sql components now support SQLite.
New beanstalkd input and output.
Field json_marshal_mode added to the mongodb input.
The schema_registry_encode and schema_registry_decode processors now support Basic, OAuth and JWT authentication.

Fixed

The streams mode /ready endpoint no longer returns status 503 for streams that gracefully finished.
The performance of the bloblang .explode method now scales linearly with the target size.
The influxdb and logger metrics outputs should no longer mix up tag names.
Fix a potential race condition in the read_until connect check on terminated input.
The parse_parquet bloblang method and parquet_decode processor now automatically parse BYTE_ARRAY values as strings when the logical type is UTF8.
The gcp_cloud_storage output now correctly cleans up temporary files on error conditions when the collision mode is set to append.

The full change log can be found here.

`v4.6.0`

Compare Source

For installation instructions check out the getting started guide.

Added

New squash bloblang method.
New top-level config field shutdown_delay for delaying graceful termination.
New snowflake_id bloblang function.
Field wait_time_seconds added to the aws_sqs input.
New json_path bloblang method.
New file_json_contains predicate for unit tests.
The parquet_encode processor now supports the UTF8 logical type for columns.

Fixed

The schema_registry_encode processor now correctly assumes Avro JSON encoded documents by default.
The redis processor retry_period no longer shows linting errors for duration strings.
The /inputs and /outputs endpoints for dynamic inputs and outputs now correctly render configs, both structured within the JSON response and the raw config string.
Go API: The stream builder no longer ignores http configuration. Instead, the value of http.enabled is set to false by default.

The full change log can be found here.

`v4.5.1`

Compare Source

For installation instructions check out the getting started guide.

Fixed

Reverted kafka_franz dependency back to 1.3.1 due to a regression in TLS/SASL commit retention.
Fixed an unintentional linting error when using interpolation functions in the elasticsearch output's action field.

The full change log can be found here.

`v4.5.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field batch_size added to the generate input.
The amqp_0_9 output now supports setting the timeout of publish.
New experimental input codec avro-ocf:marshaler=x.
New mapping and mutation processors.
New parse_form_url_encoded bloblang method.
The amqp_0_9 input now supports setting the auto-delete bit during queue declaration.
New open_telemetry_collector tracer.
The kafka_franz input and output now supports no-op SASL options with the mechanism none.
Field content_type added to the gcp_cloud_storage cache.

Fixed

The mongodb processor and output default write_concern.w_timeout empty value no longer causes configuration issues.
Field message_name added to the logger config.
The amqp_1 input and output should no longer spam logs with timeout errors during graceful termination.
Fixed a potential crash when the contains bloblang method was used to compare complex types.
Fixed an issue where the kafka_franz input or output wouldn't use TLS connections without custom certificate configuration.
Fixed structural cycle in the CUE representation of the retry output.
Tracing headers from HTTP requests to the http_server input are now correctly extracted.

Changed

The broker input no longer applies processors before batching as this was unintentional behaviour and counter to documentation. Users that rely on this behaviour are advised to place their pre-batching processors at the level of the child inputs of the broker.
The broker output no longer applies processors after batching as this was unintentional behaviour and counter to documentation. Users that rely on this behaviour are advised to place their post-batching processors at the level of the child outputs of the broker.

The full change log can be found here.

`v4.4.1`

Compare Source

For installation instructions check out the getting started guide.

Fixed

Fixed an issue where an http_server input or output would fail to register prometheus metrics when combined with other inputs/outputs.
Fixed an issue where the jaeger tracer was incapable of sending traces to agents outside of the default port.

The full change log can be found here.

`v4.4.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The service-wide http config now supports basic authentication.
The elasticsearch output now supports upsert operations.
New fake bloblang function.
New parquet_encode and parquet_decode processors.
New parse_parquet bloblang method.
CLI flag --prefix-stream-endpoints added for disabling streams mode API prefixing.
Field timestamp_name added to the logger config.
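
For context on the basic authentication entry above, this is a generic sketch of HTTP basic auth enforced by a wrapper around a Go `http.Handler`. It is not taken from Benthos and says nothing about how the service-wide http config exposes the feature; the function names and credentials are hypothetical.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// basicAuth rejects requests whose credentials do not match user and pass.
func basicAuth(user, pass string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		// Constant-time comparison avoids leaking how many bytes matched.
		userOK := subtle.ConstantTimeCompare([]byte(u), []byte(user)) == 1
		passOK := subtle.ConstantTimeCompare([]byte(p), []byte(pass)) == 1
		if !ok || !userOK || !passOK {
			w.Header().Set("WWW-Authenticate", `Basic realm="restricted"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoHandler wraps a trivial endpoint with hypothetical credentials.
func demoHandler() http.Handler {
	pong := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "pong")
	})
	return basicAuth("admin", "s3cret", pong)
}

// statusFor issues an in-memory request and returns the response status.
func statusFor(h http.Handler, user, pass string, withCreds bool) int {
	req := httptest.NewRequest(http.MethodGet, "/ping", nil)
	if withCreds {
		req.SetBasicAuth(user, pass)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println("valid credentials:", statusFor(h, "admin", "s3cret", true))
	fmt.Println("wrong password:   ", statusFor(h, "admin", "nope", true))
	fmt.Println("no credentials:   ", statusFor(h, "", "", false))
}
```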

The full change log can be found here.

`v4.3.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Timestamp Bloblang methods are now able to emit and process time.Time values.
New ts_tz method for switching the timezone of timestamp values.
The elasticsearch output field type now supports interpolation functions.
The redis processor has been reworked to be more generally useful. The old operator and key fields are now deprecated in favour of new command and args_mapping fields.
Go API: Added component bundle ./public/components/aws for all AWS components, including a RunLambda function.
New cached processor.
Go API: New APIs for registering both metrics exporters and open telemetry tracer plugins.
Go API: The stream builder API now supports configuring a tracer, and tracer configuration is now isolated to the stream being executed.
Go API: Plugin components can now access input and output resources.
The redis_streams output field stream now supports interpolation functions.
The kafka_franz input and outputs now support AWS_MSK_IAM as a SASL mechanism.
New pusher output.
Field input_batches added to config unit tests for injecting a series of message batches.

Fixed

Corrected an issue where Prometheus metrics from batching at the buffer level would be skipped when combined with input/output level batching.
Go API: Fixed an issue where running the CLI API without importing a component package would result in template init crashing.
The http processor and http_client input and output no longer have default headers as part of their configuration. A Content-Type header will be added to requests with a default value of application/octet-stream when a message body is being sent and the configuration has not added one explicitly.
Logging in logfmt mode with add_timestamp enabled now works.
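
The Content-Type rule described in the HTTP fix above can be sketched with `net/http` as follows. This is an illustration of the stated rule only (add `application/octet-stream` when a body is sent and no Content-Type was configured), not the Benthos source; the helper names and the URL are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newRequest applies the configured headers, then falls back to a default
// Content-Type only when a body is present and none was set explicitly.
func newRequest(method, url string, body []byte, headers map[string]string) (*http.Request, error) {
	req, err := http.NewRequest(method, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	if len(body) > 0 && req.Header.Get("Content-Type") == "" {
		req.Header.Set("Content-Type", "application/octet-stream")
	}
	return req, nil
}

// contentType reports the header a request would carry (empty if unset).
func contentType(body []byte, headers map[string]string) string {
	req, err := newRequest(http.MethodPost, "http://localhost:4195/post", body, headers)
	if err != nil {
		return ""
	}
	return req.Header.Get("Content-Type")
}

func main() {
	fmt.Printf("body, no header: %q\n", contentType([]byte("hello"), nil))
	fmt.Printf("no body:         %q\n", contentType(nil, nil))
	fmt.Printf("explicit header: %q\n",
		contentType([]byte("{}"), map[string]string{"Content-Type": "application/json"}))
}
```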

The full change log can be found here.

`v4.2.0`

Compare Source

For installation instructions check out the getting started guide.

Added

Field credentials.from_ec2_role added to all AWS based components.
The mongodb input now supports aggregation filters by setting the new operation field.
New gcp_cloudtrace tracer.
New slug bloblang string method.
The elasticsearch output now supports the create action.
Field tls.root_cas_file added to the pulsar input and output.
The fallback output now adds a metadata field fallback_error to messages when shifted.
New bloblang methods ts_round, ts_parse, ts_format, ts_strptime, ts_strftime, ts_unix and ts_unix_nano. Most are aliases of (now deprecated) time methods with timestamp_ prefixes.
Ability to write logs to a file (with optional rotation) instead of stdout.

Fixed

The default docker image no longer throws configuration errors when running streams mode without an explicit general config.
The field metrics.mapping now allows environment functions such as hostname and env.
Fixed a lock-up in the amqp_0_9 output caused when messages sent with the immediate or mandatory flags were rejected.
Fixed a race condition upon creating dynamic streams that self-terminate, this was causing panics in cases where the stream finishes immediately.

The full change log can be found here.

`v4.1.0`

Compare Source

For installation instructions check out the getting started guide.

Added

The nats_jetstream input now adds headers to messages as metadata.
Field headers added to the nats_jetstream output.
Field lazy_quotes added to the CSV input.

Fixed

Fixed an issue where resource and stream configs imported via wildcard pattern could not be live-reloaded with the watcher (-w) flag.
Bloblang comparisons between numerical values (including match expression patterns) no longer require coercion into explicit types.
Reintroduced basic metrics from the twitter and discord template based inputs.
Prevented a metrics label mismatch when running in streams mode with resources and prometheus metrics.
Label mismatches with the prometheus metric type now log errors and skip the metric without stopping the service.
Fixed a case where empty files consumed by the aws_s3 input would trigger early graceful termination.

The full change log can be found here.

`v4.0.0`

Compare Source

For installation instructions check out the getting started guide.

This is a major version release, for more information and guidance on how to migrate please refer to https://benthos.dev/docs/guides/migration/v4.

Added

In Bloblang it is now possible to reference the root of the document being created within a mapping query.
The nats_jetstream input now supports pull consumers.
Field max_number_of_messages added to the aws_sqs input.
Field file_output_path added to the prometheus metrics type.
Unit test definitions can now specify a label as a target_processors value.
New connection settings for all sql components.
New experimental snowflake_put output.
New experimental gcp_cloud_storage cache.
Field regexp_topics added to the kafka_franz input.
The hdfs output directory field now supports interpolation functions.
The cli list subcommand now supports a cue format.
Field jwt.headers added to all HTTP client components.
Output condition file_json_equals added to config unit test definitions.

Fixed

The sftp output no longer opens files in both read and write mode.
The aws_sqs input with reset_visibility set to false will no longer reset timeouts on pending messages during graceful shutdown.
The schema_registry_decode processor now handles AVRO logical types correctly. Details in #1198 and #1161 and also in https://github.com/linkedin/goavro/issues/242.

Changed

All components, features and configuration fields that were marked as deprecated have been removed.
The pulsar input and output are no longer included in the default Benthos builds.
The pipeline.threads field now defaults to -1, which automatically matches the host machine CPU count.
Old style interpolation functions (${!json:foo,1}) are removed in favour of the newer Bloblang syntax (${! json("foo") }).
The Bloblang functions meta, root_meta, error and env now return null when the target value does not exist.
The clickhouse SQL driver Data Source Name format parameters have been changed due to a client library update. This also means placeholders in sql_raw components should use dollar syntax.
Docker images no longer come with a default config that contains generated environment variables, use -s flag arguments instead.
All cache components have had their retry/backoff fields modified for consistency.
All cache components that support a general default TTL now have a field default_ttl with a duration string, replacing the previous field.
The http processor and http_client output now execute message batch requests as individual requests by default. This behaviour can be disabled by explicitly setting batch_as_multipart to true.
Outputs that traditionally wrote empty newlines at the end of batches with >1 message when using the lines codec (socket, stdout, file, sftp) no longer do this by default.
The switch output field retry_until_success now defaults to false.
All AWS components now have a default region field that is empty, allowing environment variables or profile values to be used by default.
Serverless distributions of Benthos (AWS lambda, etc) have had the default output config changed to reject messages when the processing fails, this should make it easier to handle errors from invocation.
The standard metrics emitted by Benthos have been largely simplified and improved, for more information check out the metrics page.
The default metrics type is now prometheus.
The http_server metrics type has been renamed to json_api.
The stdout metrics type has been renamed to logger.
The logger configuration section has been simplified, with logfmt being the new default format.
The logger field add_timestamp is now false by default.
Field parts has been removed from all processors.
Field max_in_flight has been removed from a range of output brokers as it is no longer required.
The dedupe processor now acts upon individual messages by default, and the hash field has been removed.
The log processor now executes for each individual message of a batch.
The sleep processor now executes for each individual message of a batch.
Go API: Module name has changed to github.com/benthosdev/benthos/v4.
Go API: All packages within the lib directory have been removed in favour of the newer APIs within public.
Go API: Distributed tracing is now via the Open Telemetry client library.
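
Since the v4 Go API lives under a new module name, code importing github.com/Jeffail/benthos/v3 would need its import paths updated in addition to any go.mod change. The sketch below rewrites that prefix using the standard go/parser and go/format packages. It is a rough illustration, not an official migration tool: `rewriteImports` and `hasImport` are hypothetical helpers, and a prefix rewrite alone will not be enough for code that used the removed lib packages, which have to be ported to the public APIs by hand.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

const (
	oldModule = "github.com/Jeffail/benthos/v3"
	newModule = "github.com/benthosdev/benthos/v4"
)

// rewriteImports replaces the old module prefix in every import of src.
func rewriteImports(src string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, imp := range file.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return "", err
		}
		// Match the module itself or any package below it, nothing else.
		if path == oldModule || strings.HasPrefix(path, oldModule+"/") {
			imp.Path.Value = strconv.Quote(newModule + strings.TrimPrefix(path, oldModule))
		}
	}
	var out bytes.Buffer
	if err := format.Node(&out, fset, file); err != nil {
		return "", err
	}
	return out.String(), nil
}

// hasImport reports whether src contains the quoted import path.
func hasImport(src, path string) bool {
	return strings.Contains(src, strconv.Quote(path))
}

func main() {
	src := "package demo\n\nimport \"github.com/Jeffail/benthos/v3/public/service\"\n\nvar _ = service.NewConfigSpec\n"
	out, err := rewriteImports(src)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```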

The full change log can be found here.

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this MR and you won't be reminded about this update again.

If you want to rebase/retry this MR, check this box

This MR has been generated by Renovate Bot.

Edited Oct 17, 2024 by autocafe