Releases · apollographql/router

07 Jul 14:48

v1.23.0

13d5382


🚀 Features

Add `--listen` to CLI args (PR #3296)

Adds --listen to CLI args, which allows the user to specify the address to listen on.
It can also be set via environment variable APOLLO_ROUTER_LISTEN_ADDRESS.

router --listen 0.0.0.0:4001

By @ptondereau and @BrynCooke in #3296

Move operation limits and parser limits to General Availability (PR #3356)

Operation Limits (a GraphOS Enterprise feature) and parser limits are now moving to General Availability, from Preview where they have been since Apollo Router 1.17.

For more information about launch stages, please see the documentation here: https://www.apollographql.com/docs/resources/product-launch-stages/

In addition to removing the preview_ prefix, the configuration section has been renamed to just limits to encapsulate operation, parser and request limits. (The request size limit is still experimental.) Existing configuration files will keep working as before, but with a warning output to the logs. To fix that warning, rename the configuration section like so:

-preview_operation_limits:
+limits:
   max_depth: 100
   max_height: 200
   max_aliases: 30
   max_root_fields: 20

By @SimonSapin in #3356

Add support for readiness/liveness checks (Issue #3233)

Kubernetes lifecycle interop has been improved by implementing liveness and readiness checks.

Kubernetes considers a service to be:

live - if it isn't deadlocked
ready - if it is able to start accepting traffic

(For more details: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)

The existing health check didn't surface this information. Instead, it returned a payload indicating whether the router was "healthy", and that payload was always "UP" (hard-coded).

The router health check now exposes this information based on the following rules:

Live
- Is not in state Errored
- Health check enabled and responding
Ready
- Is running and accepting requests.
- Is Live

To maintain backwards compatibility, query parameters named "ready" and "live" have been added to our existing health endpoint. Both POST and GET are supported.

Sample queries:

curl -XPOST "http://localhost:8088/health?ready" OR curl  "http://localhost:8088/health?ready"
curl -XPOST "http://localhost:8088/health?live" OR curl "http://localhost:8088/health?live"
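
The same probes can be issued from a script. The following is a minimal Python sketch: the base address is copied from the sample queries above, the helper names `probe_url` and `check` are hypothetical, and treating any 2xx status as success is an assumption rather than something these notes specify.

```python
import urllib.error
import urllib.request

# Base address copied from the sample queries above; adjust for your deployment.
HEALTH_BASE = "http://localhost:8088/health"


def probe_url(kind: str, base: str = HEALTH_BASE) -> str:
    """Build the health URL for a 'ready' or 'live' probe."""
    if kind not in ("ready", "live"):
        raise ValueError(f"unknown probe kind: {kind!r}")
    return f"{base}?{kind}"


def check(kind: str, timeout: float = 2.0) -> bool:
    """Return True if the probe answers with a 2xx status (assumed to mean success)."""
    try:
        with urllib.request.urlopen(probe_url(kind), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status raised as HTTPError.
        return False
```

Calling `check("ready")` and `check("live")` separately mirrors how an orchestrator would configure the two probes independently.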

By @garypen in #3276

Include path to Rhai script in syntax error messages

Syntax errors in the main Rhai script will now include the path to the script in the error message.

By @dbanty in #3254

Experimental support for GraphQL validation in Rust

We are experimenting with a new GraphQL validation implementation written in Rust. The legacy implementation is part of the JavaScript query planner. This is part of a project to remove JavaScript from the Router to improve performance and memory behavior.

To opt in to the new validation implementation, set:

experimental_graphql_validation_mode: new

Or use both to run the implementations side by side and log a warning if there is a difference in results:

experimental_graphql_validation_mode: both

This is an experimental option while we are still finding edge cases in the new implementation, and will be removed once we have confidence that parity has been achieved.

By @goto-bus-stop in #3134

Add environment variable access to Rhai (Issue #1744)

This introduces support for accessing environment variables within Rhai. The new env module contains one function and is imported by default.

By @garypen in #3240

Add support for getting request method in Rhai (Issue #2467)

This adds support for getting the HTTP method of requests in Rhai.

fn process_request(request) {
    if request.method == "OPTIONS"  {
        request.headers["x-custom-header"] = "value"
    }
}

By @garypen in #3355

Add additional build functionality to the diy build script (Issue #3303)

The diy build script is useful for ad-hoc image creation during testing or for building your own images based on a router repo. This set of enhancements makes it possible to:

build docker images from arbitrary (nightly) builds (-a)
build an amd64 docker image on an arm64 machine (or vice versa) (-m)
change the name of the image from the default 'router' (-n)

Note: the build machine image architecture is used if the -m flag is not supplied.

By @garypen in #3304

🐛 Fixes

Bring root span name in line with otel semantic conventions. (Issue #3229)

Root span name has changed from request to <graphql.operation.kind> <graphql.operation.name>

OpenTelemetry GraphQL semantic conventions specify that the root span name must match the operation kind and name.

Many tracing providers don't have good support for filtering traces via attribute, so changing this significantly enhances the tracing experience.

By @BrynCooke in #3364

An APQ query with a mismatched hash will error as HTTP 400 (Issue #2948)

The Router now behaves the same way as the Gateway. Although the previous behavior was tolerable, a mismatched hash indicates a misconfigured client and should be rejected early.

Previously, if a client sent an operation with a mismatched APQ hash, we would merely log an error to the console and not register the operation (for the next request), but still execute the query. We now return a GraphQL error and don't execute the query. No clients should be impacted by this, though anyone who had hand-crafted a query with APQ information (for example, copied a previous APQ-registration query but only changed the operation without re-calculating the SHA-256) might now be forced to use the correct hash (or more practically, remove the hash).
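
For anyone hand-crafting such requests, the hash can be recomputed rather than copied. This is a minimal Python sketch, assuming the usual APQ request shape (a `persistedQuery` extension carrying `version` and `sha256Hash`); the helper names are hypothetical and the check shown is an illustration, not the Router's implementation.

```python
import hashlib
import json


def apq_extensions(query: str) -> dict:
    """Build the persistedQuery extension with a hash computed from the query text."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return {"persistedQuery": {"version": 1, "sha256Hash": digest}}


def hash_matches(query: str, extensions: dict) -> bool:
    """Check that the supplied hash is the SHA-256 of the supplied query."""
    supplied = extensions.get("persistedQuery", {}).get("sha256Hash")
    expected = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return supplied == expected


query = "{ __typename }"
body = {"query": query, "extensions": apq_extensions(query)}
print(json.dumps(body))
```

Changing `query` without calling `apq_extensions` again produces exactly the mismatch described above.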

By @o0Ignition0o in #3128

fix(subscription): take the callback url path from the configuration (Issue #3361)

Previously, the value specified in subscription.mode.callback.path was not used; the path was hardcoded to /callback. The Router now uses the path specified in the configuration.

By @bnjjj in #3366

Preserve all shutdown receivers across reloads (Issue #3139)

We keep a list of all active requests and process all of them during shutdown. This avoids prematurely terminating connections when:

some requests are in flight
the router reloads (new schema, etc)
the router gets a shutdown signal

By @garypen in #3311

Enable serde_json float_roundtrip feature (Issue #2951)

The Router now preserves JSON floating point numbers exactly as they are received by enabling the serde_json float_roundtrip feature:

Use sufficient precision when parsing fixed precision floats from JSON to ensure that they maintain accuracy when round-tripped through JSON. This comes at an approximately 2x performance cost for parsing floats compared to the default best-effort precision.
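
The round-trip property itself can be illustrated outside of Rust. The sketch below uses Python's standard json module purely as a stand-in to show what "round-tripping" means; it says nothing about serde_json's behavior or performance, and `round_trips` is a hypothetical helper.

```python
import json


def round_trips(number_text: str) -> bool:
    """Parse a JSON number, serialize it, parse again, and compare the two floats."""
    first = json.loads(number_text)
    second = json.loads(json.dumps(first))
    return first == second


# A few values that are awkward in binary floating point.
samples = ["0.1", "1.7976931348623157e308", "5e-324"]
results = {text: round_trips(text) for text in samples}
```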

By @garypen in #3338

Fix deferred response formatting when filtering queries (PR #3298, Issue #3263, PR #3339)

Filtering queries requires two levels of response formatting, and its implementation highlighted issues with deferred responses. Response formatting needs to recognize which deferred fragment generated it, and that the deferred response shapes can change depending on request variables, due to the @defer directive's if argument.

For now, this is solved by generating the response shapes for primary and deferred responses, for each combination of the variables used in @defer applications, limited to 32 unique variables. There will be follow up work with another approach that removes this limitation.
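
To give a feel for why the number of variables is capped, here is a small Python sketch that enumerates every true/false combination of the variables used in @defer conditions. The function name and the error raised at the limit are hypothetical, and the Router's actual representation of response shapes is not described in these notes.

```python
from itertools import product

MAX_DEFER_VARIABLES = 32  # limit stated in the release note


def defer_variable_combinations(variables):
    """Yield one {name: bool} mapping per combination of @defer `if` variables."""
    names = sorted(set(variables))
    if len(names) > MAX_DEFER_VARIABLES:
        raise ValueError("too many distinct variables used in @defer conditions")
    for values in product((False, True), repeat=len(names)):
        yield dict(zip(names, values))


combos = list(defer_variable_combinations(["showReviews", "showAuthor"]))
```

With n distinct variables there are 2^n combinations, so two variables give four candidate response shapes.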

By @Geal and @SimonSapin in #3298, #3263 and #3339

...

Contributors

garypen, Geal, and 8 other contributors


05 Jul 12:10

apollo-bot2

v1.23.0-alpha.0

9c82318

v1.23.0-alpha.0 Pre-release


21 Jun 10:15

apollo-bot2

v1.22.0

f9c48bc


🚀 Features

Federated Subscriptions (PR #3285)

⚠️ This is an Enterprise feature of the Apollo Router. It requires an organization with a GraphOS Enterprise plan.

If your organization doesn't currently have an Enterprise plan, you can test out this functionality by signing up for a free Enterprise trial.

High-Level Overview

What are Federated Subscriptions?

This PR adds GraphQL subscription support to the Router for use with Federation. Clients can now use GraphQL subscriptions with the Router to receive realtime updates from a supergraph. With these changes, subscription operations are now a first-class supported feature of the Router and Federation, alongside queries and mutations.

Client to Router Communication

Apollo has designed and implemented a new open protocol for handling subscriptions called multipart subscriptions
With this new protocol clients can manage subscriptions with the Router over tried and true HTTP; WebSockets, SSE (server-sent events), etc. are not needed
All Apollo clients (Apollo Client web, Apollo Kotlin, Apollo iOS) have been updated to support multipart subscriptions, and can be used out of the box with little to no extra configuration
Subscription communication between clients and the Router must use the multipart subscription protocol, meaning only subscriptions over HTTP are supported at this time

Router to Subgraph Communication

The Router communicates with subscription enabled subgraphs using WebSockets
By default, the router sends subscription requests to subgraphs using the graphql-transport-ws protocol which is implemented in the graphql-ws library. You can also configure it to use the graphql-ws protocol which is implemented in the subscriptions-transport-ws library.
Subscription ready subgraphs can be introduced to Federation and the Router as is - no additional configuration is needed on the subgraph side

Subscription Execution

When the Router receives a GraphQL subscription request, the generated query plan will contain an initial subscription request to the subgraph that contributed the requested subscription root field.

For example, as a result of a client sending this subscription request to the Router:

subscription {
  reviewAdded {
    id
    body
    product {
      id
      name
      createdBy {
        name
      }
    }
  }
}

The router will send this request to the reviews subgraph:

subscription {
  reviewAdded {
    id
    body
    product {
      id
    }
  }
}

When the reviews subgraph receives new data from its underlying source event stream, that data is sent back to the Router. Once received, the Router continues following the determined query plan to fetch any additional required data from other subgraphs:

Example query sent to the products subgraph:

query ($representations: [_Any!]!) {
  _entities(representations: $representations) {
    ... on Product {
      name
      createdBy {
        __typename
        email
      }
    }
  }
}

Example query sent to the users subgraph:

query ($representations: [_Any!]!) {
  _entities(representations: $representations) {
    ... on User {
      name
    }
  }
}

When the Router finishes running the entire query plan, the data is merged back together and returned to the requesting client over HTTP (using the multipart subscriptions protocol).

Configuration

Here is a configuration example:

subscription:
  mode:
    passthrough:
      all: # The router uses these subscription settings UNLESS overridden per-subgraph
        path: /subscriptions # The path to use for subgraph subscription endpoints (Default: /ws)
      subgraphs: # Overrides subscription settings for individual subgraphs
        reviews: # Overrides settings for the 'reviews' subgraph
          path: /ws # Overrides '/subscriptions' defined above
          protocol: graphql_transport_ws # The WebSocket-based protocol to use for subscription communication (Default: graphql_ws)

Usage Reporting

Subscription use is tracked in the Router as follows:

Subscription registration: The initial subscription operation sent by a client to the Router that's responsible for starting a new subscription
Subscription notification: The resolution of the client subscription’s selection set in response to a subscription enabled subgraph source event

Subscription registration and notification (with operation traces and statistics) are sent to Apollo Studio for observability.

Advanced Features

This PR includes the following configurable performance optimizations.

Deduplication

If the Router detects that a client is using the same subscription as another client (i.e., a subscription with the same HTTP headers and selection set), it will avoid starting a new subscription with the requested subgraph. The Router will reuse the same open subscription instead, and will send the same source events to the new client.
This helps reduce the number of WebSockets that need to be opened between the Router and subscription enabled subgraphs, thereby drastically reducing Router to subgraph network traffic and overall latency
For example, if 100 clients are subscribed to the same subscription there will be 100 open HTTP connections from the clients to the Router, but only 1 open WebSocket connection from the Router to the subgraph
Subscription deduplication between the Router and subgraphs is enabled by default (but can be disabled via the Router config file)
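
The deduplication idea can be sketched in a few lines of Python. The keying scheme (a hash over normalized headers plus the subscription text) and the class and function names are assumptions made for illustration; the notes above only state that headers and selection set must match.

```python
import hashlib
import json


def dedup_key(headers: dict, operation: str) -> str:
    """Derive a key from the request headers and the subscription text (hypothetical scheme)."""
    normalized = sorted((k.lower(), v) for k, v in headers.items())
    payload = json.dumps({"headers": normalized, "operation": operation})
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SubscriptionRegistry:
    """Tracks one upstream subscription per key and the clients sharing it."""

    def __init__(self):
        self.clients_by_key = {}

    def subscribe(self, client_id, headers, operation) -> bool:
        """Register a client; return True only if a new upstream connection is needed."""
        key = dedup_key(headers, operation)
        is_new = key not in self.clients_by_key
        self.clients_by_key.setdefault(key, set()).add(client_id)
        return is_new


registry = SubscriptionRegistry()
operation = "subscription { reviewAdded { id } }"
opened = sum(
    registry.subscribe(client, {"Authorization": "token"}, operation)
    for client in range(100)
)
```

Here 100 clients with identical headers and operation lead to a single upstream subscription, matching the example in the list above.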

Callback Mode

Instead of sending subscription data between a Router and subgraph over an open WebSocket, the Router can be configured to send the subgraph a callback URL that will then be used to receive all source stream events
Subscription enabled subgraphs send source stream events (subscription updates) back to the callback URL by making HTTP POST requests
Refer to the callback mode documentation for more details, including an explanation of the callback URL request/response payload format
This feature is still experimental and needs to be enabled explicitly in the Router config file

By @bnjjj and @o0Ignition0o in #3285

Contributors

bnjjj and o0Ignition0o


20 Jun 15:35

apollo-bot2

v1.22.0-alpha.0

7a5d03d

v1.22.0-alpha.0 Pre-release


20 Jun 10:46

apollo-bot2

v1.21.0

065a6a6


🚀 Features

Restore HTTP payload size limit, make it configurable (Issue #2000)

Early versions of Apollo Router used to rely on a part of the Axum web framework
that imposed a 2 MB limit on the size of the HTTP request body.
Version 1.7 changed to read the body directly, unintentionally removing this limit.

The limit is now restored to help protect against unbounded memory usage, but is now configurable:

preview_operation_limits:
  experimental_http_max_request_bytes: 2000000 # Default value: 2 MB

This limit is checked while reading from the network, before JSON parsing.
Both the GraphQL document and associated variables count toward it.

Before increasing this limit significantly, consider testing performance
in an environment similar to your production, especially if some clients are untrusted.
Many concurrent large requests could cause the Router to run out of memory.
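
Checking the limit "while reading from the network" can be pictured as follows. This is a Python sketch of the general technique only; the function, exception, and chunk size are hypothetical and the Router's own implementation is written in Rust.

```python
import io

DEFAULT_MAX_BYTES = 2_000_000  # default from the configuration above


class PayloadTooLarge(Exception):
    """Raised when the body grows past the configured limit."""


def read_limited(stream, max_bytes: int = DEFAULT_MAX_BYTES, chunk_size: int = 8192) -> bytes:
    """Read a request body in chunks, aborting as soon as the limit is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop before buffering more data or attempting to parse JSON.
            raise PayloadTooLarge(f"body exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


body = read_limited(io.BytesIO(b'{"query": "{ __typename }"}'))
```

Because the check happens per chunk, an oversized request is rejected without ever being held in memory in full.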

By @SimonSapin in #3130

Add support for empty auth prefixes (Issue #2909)

The authentication.jwt plugin now supports empty prefixes for the JWT header. Some companies use prefix-less headers; previously, the authentication plugin rejected requests even with an empty header explicitly set, such as:

authentication:
  jwt:
    header_value_prefix: ""
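
The effect of an empty prefix can be sketched like this in Python. The function name is hypothetical, and the "Bearer" default used here is an assumption for the sake of the example; it is not taken from these notes.

```python
from typing import Optional


def extract_token(header_value: str, prefix: str = "Bearer") -> Optional[str]:
    """Return the JWT from a header value, or None if the expected prefix is missing."""
    value = header_value.strip()
    if prefix == "":
        # Prefix-less header: the whole value is the token.
        return value or None
    expected = prefix + " "
    if not value.startswith(expected):
        return None
    token = value[len(expected):].strip()
    return token or None
```

With `prefix=""`, a header whose value is just the token is accepted instead of being rejected for lacking a prefix.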

By @lleadbet in #3206

🐛 Fixes

GraphQL introspection errors are now 400 errors (Issue #3090)

Previously, if we got an introspection error during SupergraphService::plan_query(), it was reported to the client as an HTTP 500 error. The Router now generates a valid GraphQL error for introspection errors and sets the HTTP status to 400.

Before:

StatusCode:500

{"errors":[{"message":"value retrieval failed: introspection error: introspection error : Field \"__schema\" of type \"__Schema!\" must have a selection of subfields. Did you mean \"__schema { ... }\"?","extensions":{"code":"INTERNAL_SERVER_ERROR"}}]}

After:

StatusCode:400

{"errors":[{"message":"introspection error : Field \"__schema\" of type \"__Schema!\" must have a selection of subfields. Did you mean \"__schema { ... }\"?","extensions":{"code":"INTROSPECTION_ERROR"}}]}

By @garypen in #3122

Restore missing debug tools in "debug" Docker images (Issue #3249)

Debug Docker images were designed to make use of heaptrack for debugging memory issues. However, this functionality was inadvertently removed when we changed to multi-architecture Docker image builds.

heaptrack functionality is now restored to our debug docker images.

By @garypen in #3250

Federation v2.4.8 (Issue #3217, Issue #3227)

This release bumps the Router's Federation support from v2.4.7 to v2.4.8, which brings in notable query planner fixes. Per that dependency's changelog, these fixes:

Fix bug in the handling of dependencies of subgraph fetches. This bug was manifesting itself as an assertion error thrown during query planning with a message of the form Root groups X should have no remaining groups unhandled (...). (apollographql/federation#2622)
Fix issues in code to reuse named fragments. One of the fixed issues would manifest as an assertion error with a message looking like Cannot add fragment of condition X (...) to parent type Y (...). Another would manifest itself by generating an invalid subgraph fetch where a field conflicts with another version of that field that is in a reused named fragment. (apollographql/federation#2619)

These manifested as Router issues #3217 and #3227.

By @renovate and @o0Ignition0o in #3202

update Rhai to 1.15.0 to fix issue with hanging example test (Issue #3213)

One of our Rhai examples' tests has been regularly hanging in the CI builds. Investigation uncovered a race condition within Rhai itself. This update brings in the fixed version of Rhai and should eliminate the hanging problem and improve build stability.

By @garypen in #3273

🛠 Maintenance

chore: split out router events into its own module (PR #3235)

Breaks down ./apollo-router/src/router.rs into its own module ./apollo-router/src/router/mod.rs with a sub-module ./apollo-router/src/router/event/mod.rs that contains all the streams that we combine to start a router (entitlement, schema, reload, configuration, shutdown, more streams to be added).

By @EverlastingBugstopper in #3235

Simplify router service tests (PR #3259)

Parts of the router service creation were generic, to allow mocking, but the TestHarness API allows us to reuse the same code in all cases. Generic types have been removed to simplify the API.

By @Geal in #3259

📚 Documentation

Improve example Rhai scripts for JWT Authentication (PR #3184)

Simplifies the example Rhai scripts in the JWT Authentication docs and includes a sample main.rhai file to make it clear how to use all scripts together.

By @dbanty in #3184

🧪 Experimental

Expose the apollo compiler at the supergraph service level (internal) (PR #3200)

Add a query analysis phase inside the router service, before sending the query through the supergraph plugins. It makes a compiler available to supergraph plugins, to perform deeper analysis of the query. That compiler is then used in the query planner to create the Query object containing selections for response formatting.

This is for internal use only for now, and the APIs are not considered stable.

By @o0Ignition0o and @Geal in #3200

Query planner plugins (internal) (Issue #3150)

Future functionality may need to modify a query between query plan caching and the query planner. This leads to the requirement to provide a query planner plugin capability.

Query planner plugin functionality exposes an ApolloCompiler instance to perform preprocessing of a query before sending it to the query planner.

This is for internal use only for now, and the APIs are not considered stable.

By @Geal in #3177 and #3252

Contributors

garypen, Geal, and 6 other contributors


16 Jun 13:34

apollo-bot2

v1.21.0-alpha.1

c709bb8

v1.21.0-alpha.1 Pre-release


08 Jun 10:58

apollo-bot2

v1.21.0-alpha.0

d189d46

v1.21.0-alpha.0 Pre-release


31 May 14:29

apollo-bot2

v1.20.0

20833d3


🚀 Features

Configurable histogram buckets for metrics (Issue #2333)

It is now possible to change the default bucketing for histograms generated for metrics:

telemetry:
  metrics:
    common:
      buckets:
        - 0.05
        - 0.10
        - 0.25
        - 0.50
        - 1.00
        - 2.50
        - 5.00
        - 10.00
        - 20.00
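
To see how such boundaries partition observations, consider this Python sketch. It assumes each configured value is an inclusive upper bound with an extra overflow bucket for larger values, which is a common histogram convention but is not confirmed by these notes; the helper names are hypothetical.

```python
from bisect import bisect_left

# Upper bounds copied from the configuration above.
BUCKETS = [0.05, 0.10, 0.25, 0.50, 1.00, 2.50, 5.00, 10.00, 20.00]


def bucket_index(value: float, bounds=BUCKETS) -> int:
    """Index of the first bucket whose upper bound is >= value; len(bounds) means overflow."""
    return bisect_left(bounds, value)


def histogram(values, bounds=BUCKETS):
    """Count observations per bucket, with one extra overflow bucket at the end."""
    counts = [0] * (len(bounds) + 1)
    for v in values:
        counts[bucket_index(v, bounds)] += 1
    return counts


counts = histogram([0.01, 0.3, 0.3, 25])
```

Finer boundaries in the range where most observations fall give better resolution there, at the cost of more series to store.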

By @bnjjj in #3098

🐛 Fixes

Federation v2.4.7 (Issue #3170, Issue #3133)

This release bumps the Router's Federation support from v2.4.6 to v2.4.7, which brings in notable query planner fixes. Per that dependency's changelog, these fixes:

Re-work the code used to try to reuse query named fragments to improve performance (thus sometimes improving query planning performance). (#2604)
Fix a raised assertion error (again, with a message of the form Cannot add selection of field X to selection set of parent type Y).
Fix a rare issue where an interface or union field was not being queried for all the types it should be.

By @Geal in #3185

Set the global allocator in the library crate, not just the executable (Issue #3126)

In 1.19, Apollo Router switched to use jemalloc as the global Rust allocator on Linux to reduce memory fragmentation. However, prior to this change this was only occurring in the executable binary provided by the apollo-router crate and custom binaries using the crate as a library were not getting this benefit.

The apollo-router library crate now sets the global allocator so that custom binaries also take advantage of this by default. If some other choice is desired, the global-allocator Cargo feature flag can be disabled in Cargo.toml with:

[dependencies]
apollo-router = {version = "[…]", default-features = false}

Library crates that depend on apollo-router (if any) should also do this in order to leave the choice to the eventual executable. (Cargo default features are only disabled if all dependents specify default-features = false.)

By @SimonSapin in #3157

Add `ca-certificates` to our Docker image (Issue #3173)

We removed curl from our Docker images to improve security, which meant that our implicit install of ca-certificates (as a dependency of curl) was no longer performed.

This fix reinstates the ca-certificates package explicitly, which is required for the router to be able to process TLS requests.

By @garypen in #3174

Helm: Running of `helm test` no longer fails

Running helm test was generating an error since wget was sending a request without a proper body and expecting an HTTP status response of 2xx. Without the proper body, it expectedly resulted in an HTTP status of 400. By switching to using netcat (or nc) we will now check that the port is up and use that to determine that the router is functional.

By @bbardawilwiser in #3096

Move `curl` dependency to separate layer in Docker image (Issue #3144)

We've moved curl out of the Docker image we publish. The curl command is only used while building the image, for the sake of downloading dependencies, and is never used after that. It now lives in a separate layer so that it is not part of the published image.

By @abernix in #3146

🛠 Maintenance

Improve `cargo-about` license checking (Issue #3176)

From the description of this cargo about PR, it is possible for NOASSERTION identifiers to be added when gathering license information, causing license checks to fail. This change uses the new cargo-about configuration filter-noassertion to eliminate the problem.

By @garypen in #3178

Contributors

garypen, Geal, and 4 other contributors


26 May 13:26

apollo-bot2

v1.19.1

9cf863d


🐛 Fixes

Fix router coprocessor deferred response buffering and change JSON body type from Object to String (Issue #3015)

The current implementation of the RouterResponse processing for coprocessors forces buffering of response data before passing the data to a coprocessor. This is a bug, because deferred responses should be processed progressively with a stream of calls to the coprocessor as each chunk of data becomes available.

Furthermore, the data type was assumed to be valid JSON for both RouterRequest and RouterResponse coprocessor processing. This is also a bug, because data at this stage of processing was never necessarily valid JSON. This is a particular issue when dealing with deferred (when using @defer) RouterResponses.

This change fixes both of these bugs by modifying the router so that coprocessors are invoked with a body payload which is a JSON String, not a JSON Object. Furthermore, the router now processes each chunk of response data separately so that a coprocessor will receive multiple calls (once for each chunk) for a deferred response.

For more details about how this works see the coprocessor documentation.

By @garypen in #3104

Experimental: Query plan cache keys now include a hash of the query and operation name (Issue #2998)

Note
This feature is still experimental and not recommended under normal use nor is it validated that caching query plans in a distributed fashion will result in improved performance.

The experimental feature for caching query plans in a distributed store (e.g., Redis) will now create a SHA-256 hash of the query and operation name and include that hash in the cache key, rather than using the operation document as it was previously.
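
A hash-based key of this kind might look like the Python sketch below. The `plan:` prefix, the separator byte, and the function name are all hypothetical; the note above only states that a SHA-256 hash of the query and operation name is part of the key.

```python
import hashlib
from typing import Optional


def query_plan_cache_key(query: str, operation_name: Optional[str]) -> str:
    """Build a compact cache key from a SHA-256 hash of the query and operation name."""
    hasher = hashlib.sha256()
    hasher.update(query.encode("utf-8"))
    hasher.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") hash differently
    hasher.update((operation_name or "").encode("utf-8"))
    return f"plan:{hasher.hexdigest()}"


key = query_plan_cache_key("query A { me { id } } query B { me { name } }", "A")
```

The key has a fixed length regardless of how large the operation document is, which keeps keys in the distributed store small.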

By @Geal in #3101

Federation v2.4.6 (Issue #3133)

This release bumps the Router's Federation support from v2.4.5 to v2.4.6, which brings in notable query planner fixes. Per that dependency's changelog, these fixes:

Fix assertion error in some overlapping fragment cases. In some cases, when fragments overlap on some sub-selections and some interface field implementation relied on sub-typing, an assertion error could be raised with a message of the form Cannot add selection of field X to selection set of parent type Y, and this fixes that problem. (apollographql/federation#2594)
Fix possible fragment-related assertion error during query planning. This prevents a rare case where an assertion with a message of the form Cannot add fragment of condition X (runtimes: ...) to parent type Y (runtimes: ...) could fail during query planning. (apollographql/federation#2596)

In addition, the packaging includes dependency updates for bytes, regex, once_cell, tokio, and uuid.

By @Geal in #3135

Error redaction for subgraphs now respects disabling it

This follows up on the new ability to selectively disable Studio-bound error redaction which was released in #3011 by fixing a bug which was preventing users from disabling that behavior on subgraphs. Redaction continues to be on by default and both the default behavior and the explicit redact: true option were behaving correctly.

With this fix, the tracing.apollo.errors.subgraph.all.redact option set to false will now transmit the un-redacted error message to Studio.

By @bnjjj in #3137

Evaluate multiple keys matching a JWT criteria (Issue #3017)

In some cases, multiple keys could match what a JWT asks for (both the algorithm, alg, and optional key identifier, kid). Previously, we scored each possible match and only took the one with the highest score. But even then, we could have multiple keys with the same score (e.g., colliding kid between multiple JWKS in tests).

The improved behavior will:

Return a list of those matching keys instead of only the one with the highest score.
Try them one by one until the JWT is validated, or return an error.
If some keys were found with the highest possible score (matching alg, with kid present and matching, too), then we only test those keys.
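
The selection rules above can be sketched as follows in Python. Keys are modeled as plain dictionaries, `verify` is a caller-supplied callable standing in for real signature verification, and the matching rules are a simplified assumption rather than the Router's exact scoring.

```python
def candidate_keys(jwks, alg, kid=None):
    """Return keys to try: exact alg+kid matches if any exist, otherwise all compatible keys."""
    matching = [
        key for key in jwks
        if key.get("alg") == alg and (kid is None or key.get("kid") in (None, kid))
    ]
    exact = [key for key in matching if kid is not None and key.get("kid") == kid]
    return exact or matching


def validate(token, jwks, alg, kid, verify):
    """Try each candidate in turn; return the first key that validates the token."""
    for key in candidate_keys(jwks, alg, kid):
        if verify(token, key):
            return key
    raise ValueError("no key validated the JWT")


jwks = [
    {"kid": "a", "alg": "RS256", "n": 1},
    {"kid": "a", "alg": "RS256", "n": 2},  # colliding kid, as in the test scenario above
    {"alg": "RS256", "n": 3},
    {"kid": "b", "alg": "ES256", "n": 4},
]
chosen = validate("token", jwks, "RS256", "a", lambda token, key: key["n"] == 2)
```

Even though two keys share the same kid, the second one is still tried and succeeds, which the previous single-best-match behavior could miss.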

By @Geal in #3031

🛠 Maintenance

chore(deps): `xtask/` dependency updates (PR #3149)

This is effectively running cargo update in the xtask/ directory (our directory of tooling; not runtime components) to bring things more up to date.

This changeset takes extra care to update chrono's features to remove the time dependency which is impacted by CVE-2020-26235, resolving a moderate-severity finding which was appearing in scans. Again, this is not a runtime dependency and there was no actual/known impact to any users.

By @abernix in #3149

Improve testability of the `state_machine` in integration tests

We have introduced a TestRouterHttpServer for writing more fine-grained integration tests in the Router core for the behaviors of the state machine.

By @o0Ignition0o in #3099

Contributors

garypen, Geal, and 3 other contributors


19 May 13:47

apollo-bot2

v1.19.0

1ca3b07


Note
This release focused a notable amount of effort on improving both CPU usage and memory utilization/fragmentation. Our testing and pre-release feedback has been overwhelmingly positive. 🙌

🚀 Features

GraphOS Enterprise: `require_authentication` option to reject unauthenticated requests (Issue #2866)

While the authentication plugin validates queries with JWT, it does not reject unauthenticated requests, and leaves that to other layers. This allows co-processors to handle other authentication methods, and plugins at later layers to authorize the request or not. Typically, this was done in Rhai.

This now adds an option to the Router's YAML configuration to reject unauthenticated requests. It can be used as follows:

authorization:
  require_authentication: true

The plugin will check for the presence of the apollo_authentication::JWT::claims key in the request context as proof that the request is authenticated.

By @Geal in #3002

🐛 Fixes

Prevent span attributes from being formatted to write logs

We do not show span attributes in our logs, but the log formatter still spends time formatting them to a string, even when there will be no logs written for the trace. This adds the NullFieldFormatter that entirely avoids formatting the attributes to improve performance.

By @Geal in #2890

Federation v2.4.5

This release bumps the Router's Federation support from v2.4.2 to v2.4.5, which brings in notable query planner fixes from v2.4.3 and v2.4.5. Federation v2.4.4 will not exist due to a publishing failure. Of note from those releases, this brings query planner fixes that:

Improves the heuristics used to try to reuse the query named fragments in subgraph fetches. Said fragment will be reused more often, which can lead to smaller subgraph queries (and hence overall faster processing). (apollographql/federation#2541)
Fix potential assertion error during query planning in some multi-field @requires case. This error could be triggered when a field in a @requires depended on another field that was also part of that same requires (for instance, if a field has a @requires(fields: "id otherField") and that id is also a key necessary to reach the subgraph providing otherField). (#2575)

The assertion error thrown in that case contained the message Root groups (...) should have no remaining groups unhandled (...)

By @abernix in #3107

Add support for throwing GraphQL errors in Rhai responses (Issue #3069)

It's possible to throw a GraphQL error from Rhai when processing a request. This extends the capability to include errors when processing a response.

Refer to the Terminating client requests section of the Rhai API documentation to learn how to throw GraphQL payloads.

By @garypen in #3089

Use a parking-lot mutex in `Context` to avoid contention (Issue #2751)

Request context requires synchronized access to the busy timer, and previously we used a futures aware mutex for that, but those are susceptible to contention. This replaces that mutex with a parking-lot synchronous mutex that is much faster.

By @Geal in #2885

Config and schema reloads now use async IO (Issue #2613)

If you were using a local schema or config file, the Router previously performed blocking IO on an async thread, which could stall request serving.
The Router now uses async IO for all config and schema reloads.

Fixing the above surfaced an issue with the experimental force_hot_reload feature introduced for testing. This has also been fixed, and the option has been renamed to force_reload.

experimental_chaos:
-    force_hot_reload: 1m
+    force_reload: 1m

By @BrynCooke in #3016

Improve subgraph coprocessor context processing (Issue #3058)

Each call to a subgraph coprocessor could update the entire request context as a single operation. This is racy and could lead to difficult-to-predict context modifications, depending on the order in which the router processes subgraph requests and responses.

This fix modifies the router so that subgraph coprocessor context updates are merged into the existing context. This is still racy, but subgraphs now race only at the level of individual context keys, rather than across the entire context.
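The difference between the two behaviours can be sketched as follows. This is an illustration under stated assumptions, not the router's code: the `Context` alias, the string-valued entries (real context values are JSON), and both function names are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the request context: string keys mapped to string values.
type Context = HashMap<String, String>;

/// Previous behaviour as described above: the context returned by a coprocessor
/// replaces the whole existing context.
fn replace_context(existing: &mut Context, returned: Context) {
    *existing = returned;
}

/// New behaviour: merge key by key, so only keys present in `returned` are overwritten.
fn merge_context(existing: &mut Context, returned: Context) {
    existing.extend(returned);
}
```

If two subgraph coprocessor calls each start from the same snapshot and each add their own key, whole-context replacement keeps only the key written by whichever call finishes last, while the key-level merge keeps both. Two calls that write the same key still race, which matches the caveat above.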

Future enhancements will provide a more comprehensive mechanism that will support some form of sequencing or change arbitration across subgraphs.

By @garypen in #3054

🛠 Maintenance

Add private component to the `Context` structure (Issue #2800)

There's a cost to using the Context structure during a request's lifecycle, due to the JSON serialization and deserialization incurred by inter-plugin communication (e.g., between Rhai/coprocessors and Rust). For internal router usage, we now use a more efficient structure that avoids serialization costs for private contextual properties that do not need to be exposed to plugins.
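One common way to store values without serializing them is a map keyed by Rust type. The sketch below assumes that approach purely for illustration. `PrivateEntries` and `OperationKind` are hypothetical names, and the notes above do not say which structure the router actually uses.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical private store: values are kept as typed Rust objects, keyed by type,
/// so reading one back is a downcast rather than a JSON round trip.
#[derive(Default)]
struct PrivateEntries {
    entries: HashMap<TypeId, Box<dyn Any + Send + Sync>>,
}

impl PrivateEntries {
    /// Store a value, returning the previous value of the same type if there was one.
    fn insert<T: Any + Send + Sync>(&mut self, value: T) -> Option<T> {
        self.entries
            .insert(TypeId::of::<T>(), Box::new(value))
            .and_then(|old| old.downcast::<T>().ok())
            .map(|boxed| *boxed)
    }

    /// Borrow the stored value of type `T`, if any.
    fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.entries
            .get(&TypeId::of::<T>())
            .and_then(|value| value.downcast_ref::<T>())
    }
}

/// Hypothetical private value that plugins never need to see.
#[derive(Debug, PartialEq)]
struct OperationKind(&'static str);
```

With this shape, `private.insert(OperationKind("query"))` followed by `private.get::<OperationKind>()` hands back a reference to the original value with no encoding step in between.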

By @Geal in #2802

Adds an integration test for all YAML configuration files in `./examples` (Issue #2932)

Adds an integration test that iterates over ./examples looking for .yaml files that don't have a Cargo.toml or .skipconfigvalidation sibling file, then runs setup_router_and_registry on each, failing fast on any error.
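The discovery step described above could look roughly like this. It is a sketch under assumptions: the recursive walk, the reading of "sibling" as "in the same directory", and the names `find_example_configs` and `validate_all` are all hypothetical. The real test calls setup_router_and_registry, which is represented here by a caller-supplied closure.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect `.yaml` files under `dir`, ignoring the YAML files of any
/// directory that also contains `Cargo.toml` or `.skipconfigvalidation`.
fn find_example_configs(dir: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    let skip = dir.join("Cargo.toml").exists() || dir.join(".skipconfigvalidation").exists();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            find_example_configs(&path, found)?;
        } else if !skip && path.extension().map_or(false, |ext| ext == "yaml") {
            found.push(path);
        }
    }
    Ok(())
}

/// Run `validate` on each config and stop at the first failure (fail fast),
/// reporting which file caused it.
fn validate_all<E>(
    configs: &[PathBuf],
    mut validate: impl FnMut(&Path) -> Result<(), E>,
) -> Result<(), (PathBuf, E)> {
    for config in configs {
        validate(config).map_err(|err| (config.clone(), err))?;
    }
    Ok(())
}
```

Returning the offending path together with the error keeps the failure message actionable when a single example config stops validating.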

By @EverlastingBugstopper in #3097

Improve memory fragmentation and resource consumption by switching to `jemalloc` as the memory allocator on Linux (PR #2882)

Detailed memory investigation revealed significant memory fragmentation when using the default allocator (glibc's) on Linux. Performance testing and flame-graph analysis suggested that using jemalloc on Linux would yield notable performance improvements. In our tests, performance was about 35% better than with the default allocator, on account of spending less time managing memory fragmentation.

Not everyone will see a 35% performance improvement. Depending on your usage patterns, you may see more or less than this. If you see a regression, please file an issue with details.

We have no reason to believe that there are allocation problems on other platforms, so this change is confined to Linux.

By @garypen in #2882

Improve performance by avoiding temporary allocations creating response paths (PR #2854)

Response formatting generated many temporary allocations while creating response paths. A reference-based type now holds these paths, which avoids those allocations and improves performance.
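A reference-based path can be sketched as a chain of borrowed segments that lives on the call stack, so an owned path is only built when one is actually needed. The types below (`PathRef`, `Value`) and the function `collect_null_paths` are hypothetical illustrations of that idea, not the router's implementation.

```rust
/// A response path that borrows its parent instead of owning a `Vec` of segments.
#[derive(Clone, Copy)]
enum PathRef<'a> {
    Root,
    Key(&'a PathRef<'a>, &'a str),
    Index(&'a PathRef<'a>, usize),
}

impl<'a> PathRef<'a> {
    /// Materialise an owned path. This is the only place that allocates.
    fn to_owned_path(&self) -> Vec<String> {
        let mut segments = Vec::new();
        let mut current = self;
        loop {
            match current {
                PathRef::Root => break,
                PathRef::Key(parent, key) => {
                    segments.push(key.to_string());
                    current = *parent;
                }
                PathRef::Index(parent, index) => {
                    segments.push(index.to_string());
                    current = *parent;
                }
            }
        }
        segments.reverse();
        segments
    }
}

/// Tiny hypothetical response value, just enough to walk a tree.
enum Value {
    Null,
    Leaf,
    Object(Vec<(String, Value)>),
    List(Vec<Value>),
}

/// Walk the value and record the path of every null. Descending into a child only
/// creates a small stack value; an owned path is allocated for nulls alone.
fn collect_null_paths(value: &Value, path: PathRef<'_>, out: &mut Vec<Vec<String>>) {
    match value {
        Value::Null => out.push(path.to_owned_path()),
        Value::Leaf => {}
        Value::Object(fields) => {
            for (key, child) in fields {
                collect_null_paths(child, PathRef::Key(&path, key), out);
            }
        }
        Value::List(items) => {
            for (index, child) in items.iter().enumerate() {
                collect_null_paths(child, PathRef::Index(&path, index), out);
            }
        }
    }
}
```

The trade-off is that a `PathRef` cannot outlive the stack frames it borrows from, which is why the sketch converts to an owned `Vec<String>` at the moment a path has to be kept.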

By @Geal in #2854

Contributors

garypen, Geal, and 3 other contributors

