Releases: apollographql/router
v1.8.0
📃 Configuration
Configuration changes will be automatically migrated on load. However, you should update your source configuration files as these will become breaking changes in a future major release.
Defer support graduates from preview (Issue #2368)
We're pleased to announce that @defer support has been promoted to general availability in accordance with our product launch stages.
Defer is enabled by default in the Router. However, if you had previously configured defer support explicitly, you will need to update your configuration accordingly:
Before:
supergraph:
  preview_defer_support: true
After:
supergraph:
  defer_support: true
By @BrynCooke in #2378
Remove timeout from OTLP exporter (Issue #2337)
A duplicative timeout property has been removed from the telemetry.tracing.otlp object since the batch_processor configuration already contained a timeout property. The Router will tolerate both options for now and this will be a breaking change in a future major release. Please update your configuration accordingly to reduce future work.
Before:
telemetry:
  tracing:
    otlp:
      timeout: 5s
After:
telemetry:
  tracing:
    otlp:
      batch_processor:
        timeout: 5s
By @BrynCooke in #2338
🚀 Features
The Helm chart has graduated from prerelease to general availability (PR #2380)
As part of this release, we have promoted the Helm chart from its prerelease "release-candidate" stage to a "stable" version number. We have chosen to match the version of the Helm chart to the Router version, which fits well with our automated Router release pipeline. This means the first stable version of the Helm chart will be 1.8.0, which will pair with Router 1.8.0, and subsequent versions will be in lock-step.
Emit hit/miss metrics for APQ, Query Planning and Introspection caches (Issue #1985)
Added metrics for caching.
Each cache metric contains a kind attribute to indicate the kind of cache (query planner, apq, introspection) and a storage attribute to indicate the backing storage, e.g. memory/disk.
The following metrics are exposed:
- apollo_router_cache_hit_count: cache hits.
- apollo_router_cache_miss_count: cache misses.
- apollo_router_cache_hit_time: cache hit duration.
- apollo_router_cache_miss_time: cache miss duration.
Example
# TYPE apollo_router_cache_hit_count counter
apollo_router_cache_hit_count{kind="query planner",new_test="my_version",service_name="apollo-router",storage="memory"} 2
# TYPE apollo_router_cache_hit_time histogram
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.001"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.005"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.015"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.05"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.1"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.2"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.3"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.4"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.5"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="1"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="5"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="10"} 2
apollo_router_cache_hit_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="+Inf"} 2
apollo_router_cache_hit_time_sum{kind="query planner",service_name="apollo-router",storage="memory"} 0.000236782
apollo_router_cache_hit_time_count{kind="query planner",service_name="apollo-router",storage="memory"} 2
# HELP apollo_router_cache_miss_count apollo_router_cache_miss_count
# TYPE apollo_router_cache_miss_count counter
apollo_router_cache_miss_count{kind="query planner",service_name="apollo-router",storage="memory"} 1
# HELP apollo_router_cache_miss_time apollo_router_cache_miss_time
# TYPE apollo_router_cache_miss_time histogram
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.001"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.005"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.015"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.05"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.1"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.2"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.3"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.4"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="0.5"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="1"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="5"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="10"} 1
apollo_router_cache_miss_time_bucket{kind="query planner",service_name="apollo-router",storage="memory",le="+Inf"} 1
apollo_router_cache_miss_time_sum{kind="query planner",service_name="apollo-router",storage="memory"} 0.000186783
apollo_router_cache_miss_time_count{kind="query planner",service_name="apollo-router",storage="memory"} 1
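One common use of these counters is deriving a cache hit ratio. The following is a minimal sketch, not Router code: it parses two sample lines in the Prometheus text format shown above with a simple regular expression and computes hits / (hits + misses). The helper names `counter_total` and `hit_ratio` are hypothetical, and the parser only handles the plain `name{labels} value` shape.

```python
import re

SAMPLE = '''\
apollo_router_cache_hit_count{kind="query planner",service_name="apollo-router",storage="memory"} 2
apollo_router_cache_miss_count{kind="query planner",service_name="apollo-router",storage="memory"} 1
'''

# Matches `name{labels} value` lines; comment lines starting with '#' do not match.
LINE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')


def counter_total(text, metric, kind):
    """Sum the samples of `metric` whose `kind` label equals `kind`."""
    total = 0.0
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m and m.group("name") == metric and f'kind="{kind}"' in m.group("labels"):
            total += float(m.group("value"))
    return total


def hit_ratio(text, kind):
    """Return hits / (hits + misses), or None when there were no lookups."""
    hits = counter_total(text, "apollo_router_cache_hit_count", kind)
    misses = counter_total(text, "apollo_router_cache_miss_count", kind)
    lookups = hits + misses
    return hits / lookups if lookups else None


ratio = hit_ratio(SAMPLE, "query planner")
print(ratio)
```

In practice this ratio would more likely be computed by the metrics backend itself; the script only illustrates how the two counters relate.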
Add support for single instance Redis (Issue #2300)
Experimental caching via Redis now works with single Redis instances when configured with a single URL.
Support TLS connections to single instance Redis (Issue #2332)
TLS connections are now supported when connecting to single Redis instances. This is useful for connecting to hosted Redis providers where TLS is mandatory.
TLS connections for clusters are not supported yet; see Issue #2332 for updates.
🐛 Fixes
Correctly handle aliased __typename fields (Issue #2330)
Consider a query that aliases __typename, as in this example:
{
  myproducts: products {
    total
    __typename
  }
  _0___typename: __typename
}
Before this fix, _0___typename was set to null. It now correctly returns Query.
subgraph_request span is now set as the parent of traces coming from subgraphs (Issue #2344)
Before this fix, the context injected in headers to subgraphs was wrong and not attached to the correct parent span id, causing it to appear disconnected when rendering the trace tree.
🛠 Maintenance
Simplify telemetry config code (Issue #2337)
This brings the telemetry plugin configuration closer to standards recommended in the YAML design guidance.
By @BrynCooke in #2338
Upgrade the clap version in scaffold templates (Issue #2165)
Upgrades the clap dependency to a version that supports generating scaffolded plugins via xtask.
Upgrade axum to 0.6.1 (PR #2303)
For more details about the new axum release, please read the project's change log.
Set the HTTP response content-type as application/json when returning GraphQL errors (Issue #2320)
When throwing an INVALID_GRAPHQL_REQUEST error, it now specifies the expected content-type header rather than omitting the header as it was prev...
v1.7.0
🚀 Features
Newly scaffolded projects now include a Dockerfile (Issue #2295)
Custom Router binary projects created using our scaffolding tooling will now have a Dockerfile emitted to facilitate building custom Docker containers.
By @o0Ignition0o in #2307
Apollo Uplink communication timeout is configurable (PR #2271)
The amount of time which can elapse before timing out when communicating with Apollo Uplink is now configurable via the APOLLO_UPLINK_TIMEOUT environment variable and the --apollo-uplink-timeout CLI flag, in a similar fashion to how the interval can be configured. It still defaults to 30 seconds.
By @o0Ignition0o in #2271
Query plan cache is pre-warmed using existing operations when the supergraph changes (Issue #2302, Issue #2308)
A new warmed_up_queries configuration option has been introduced to pre-warm the query plan cache when the supergraph changes.
Under normal operation, query plans are cached to avoid the recomputation cost. However, when the supergraph changes, previously-planned queries must be re-planned to account for implementation changes in the supergraph, even though the query itself may not have changed. Under load, this re-planning can cause performance variations due to the extra computation work. To reduce the impact, it is now possible to pre-warm the query plan cache for the incoming supergraph, prior to changing over to the new supergraph. Pre-warming slightly delays the roll-over to the incoming supergraph, but allows the most-requested operations to not be impacted by the additional computation work.
To enable pre-warming, the following configuration can be introduced which sets warmed_up_queries:
supergraph:
  query_planning:
    # Pre-plan the 100 most used operations when the supergraph changes. (Default is "0", disabled.)
    warmed_up_queries: 100
    experimental_cache:
      in_memory:
        # Sets the limit of entries in the query plan cache
        limit: 512
Query planning was also updated to finish executing and setting up the cache even if the response couldn't be returned to the client, which avoids throwing away computationally expensive work.
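The pre-warming idea can be sketched as follows. This is an illustration only, under the assumption that usage counts per operation are available: `plan` is a hypothetical stand-in for the (expensive) query planner, and `warm_up` is a hypothetical helper, neither of which reflects the Router's internals.

```python
from collections import Counter


def plan(schema_version, query):
    """Stand-in for query planning (hypothetical); a real planner is far more expensive."""
    return f"plan[{schema_version}]::{query}"


def warm_up(usage, new_schema, warmed_up_queries):
    """Build a fresh cache for `new_schema` holding plans for the most-used queries."""
    new_cache = {}
    for query, _count in usage.most_common(warmed_up_queries):
        new_cache[query] = plan(new_schema, query)
    return new_cache


# Usage counts gathered while serving traffic against the previous supergraph.
usage = Counter({
    "{ me { id } }": 120,
    "{ products { name } }": 75,
    "{ reviews { body } }": 3,
})

# Plan the two most-used operations before switching over to schema "v2".
cache = warm_up(usage, new_schema="v2", warmed_up_queries=2)
print(sorted(cache))
```

With `warmed_up_queries` set to 0 the new cache starts empty, which matches the disabled default described above.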
🐛 Fixes
Propagate errors across inline fragments (PR #2304)
GraphQL errors are now correctly propagated across inline fragments.
By @o0Ignition0o in #2304
Only rebuild protos if reports.proto source changes
Apollo Studio accepts traces and metrics from Apollo Router via the Protobuf specification which lives in the reports.proto file in the repository. With this contribution, we only re-build from the reports.proto file when the file has actually changed, as opposed to doing it on every build which was occurring previously. This change saves build time for developers.
By @scottdouglas1989 in #2283
Return an error on duplicate keys in configuration (Issue #1428)
Repeated use of the same key in Router YAML can be hard to notice, but it indicates a misconfiguration which can cause unexpected behavior since only one of the values can be in effect. With this improvement, the following YAML configuration will raise an error at Router startup to alert the user of the misconfiguration:
telemetry:
  tracing:
    propagation:
      jaeger: true
  tracing:
    propagation:
      jaeger: false
In this particular example, the error produced would be:
ERROR duplicated keys detected in your yaml configuration: 'telemetry.tracing'
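To illustrate what such a check involves, here is a minimal sketch that reports keys repeated under the same parent. It is not the Router's implementation: `find_duplicate_keys` is a hypothetical helper that only understands simple block-style mappings indented with spaces (no lists, flow style, multi-line scalars, or `#` inside quoted values).

```python
def find_duplicate_keys(yaml_text):
    """Return dotted paths of keys that appear twice under the same parent mapping."""
    duplicates = []
    # Each frame is (indent, key, child_keys); the root frame sits at indent -1.
    stack = [(-1, None, set())]
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # crude comment stripping
        if not line.strip() or ":" not in line:
            continue
        indent = len(line) - len(line.lstrip(" "))
        key = line.strip().split(":", 1)[0].strip()
        # Leave every mapping that is at the same or a deeper indentation level.
        while stack[-1][0] >= indent:
            stack.pop()
        siblings = stack[-1][2]
        path = ".".join([k for _, k, _ in stack[1:]] + [key])
        if key in siblings:
            duplicates.append(path)
        siblings.add(key)
        stack.append((indent, key, set()))
    return duplicates


CONFIG = """\
telemetry:
  tracing:
    propagation:
      jaeger: true
  tracing:
    propagation:
      jaeger: false
"""

print(find_duplicate_keys(CONFIG))
```

Only the outermost repeated key is reported, because the second `tracing` mapping starts with an empty set of child keys.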
Return requested __typename in initial chunk of a deferred response (Issue #1922)
The special-case __typename field is no longer treated incorrectly when requested at the root level of an operation which uses @defer. For example, consider the following query:
{
  __typename
  ...deferedFragment @defer
}
fragment deferedFragment on Query {
  slow
}
The Router now exhibits the correct behavior for this query, with __typename being returned as soon as possible in the initial chunk, as follows:
{"data":{"__typename": "Query"},"hasNext":true}
Log retriable Apollo Uplink failures at the debug level (Issue #2004)
Messages pertaining to Apollo Uplink schema fetch failures are now logged at debug level to reduce noise. Such failures do not indicate an actual error, since they can be and are retried immediately.
Traces won't cause missing field-stats (Issue #2267)
Metrics are now correctly measured comprehensively and traces will obey the trace sampling configuration. Previously, if a request was sampled out of tracing it would not always contribute to metrics correctly. This was particularly problematic for users who had configured high sampling rates for their traces.
By @BrynCooke in #2277 and #2286
Replace default notify watcher mechanism with PollWatcher (Issue #2245)
We have replaced the default mechanism used by our underlying file-system notification library, notify, with PollWatcher. This more aggressive change has been taken on account of continued reports of failed hot-reloading and follows up our previous replacement of hotwatch. We don't have very demanding file watching requirements, so while PollWatcher offers less sophisticated functionality and slightly slower reactivity, it is at least consistent on all platforms and should provide the best developer experience.
Preserve subgraph error's path property when redacting subgraph errors (Issue #1818)
The path property in errors is now preserved. Previously, error redaction was removing the error's path property, which made debugging difficult but also made it impossible to correctly match errors from deferred responses to the appropriate fields in the requested operation. Since the response shape for the primary and deferred responses are defined from the client-facing "API schema", rather than the supergraph, this change will not result in leaking internal supergraph implementation details to clients and the result will be consistent, even if the subgraph which provides a particular field changes over time.
Use correct URL decoding for variables in HTTP GET requests (Issue #2248)
The correct URL decoding will now be applied when making a GET request that passes in the variables query string parameter. Previously, all '+' characters were being replaced with spaces which broke cases where the + symbol was not merely an encoding symbol (e.g., ISO8601 date time values with timezone information).
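The difference can be illustrated with Python's standard library. This is a sketch under an assumption: `buggy_decode` models the old behavior as "every '+' ends up as a space, even one sent as %2B", which is one plausible reading of the description above rather than the Router's actual code. The variable name `since` is made up for the example.

```python
from urllib.parse import quote, unquote, unquote_plus

# A `variables` payload containing an ISO 8601 timestamp with a timezone offset.
variables = '{"since":"2022-12-01T10:00:00+01:00"}'
encoded = quote(variables, safe="")  # the literal '+' is sent as %2B


def buggy_decode(value):
    # Assumed shape of the old behavior: percent-decode, then turn every '+'
    # into a space, including the ones that were encoded as %2B.
    return unquote(value).replace("+", " ")


def correct_decode(value):
    # Form decoding: a raw '+' means space, but %2B decodes to a real '+'.
    return unquote_plus(value)


print(buggy_decode(encoded))
print(correct_decode(encoded))
```

Only the second function round-trips the timestamp; the first one silently turns the timezone offset separator into a space.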
🛠 Maintenance
Return additional details to client for invalid GraphQL requests (Issue #2301)
Additional context will be returned to clients in the error indicating the source of the error when an invalid GraphQL request is made. For example, passing a string instead of an object for the variables property will now inform the client of the mistake, providing a better developer experience:
{
  "errors": [
    {
      "message": "Invalid GraphQL request",
      "extensions": {
        "details": "failed to deserialize the request body into JSON: invalid type: string \"null\", expected a map at line 1 column 100",
        "code": "INVALID_GRAPHQL_REQUEST"
      }
    }
  ]
}
OpenTelemetry spans to subgraphs now include the request URL (Issue #2280)
A new http.url attribute has been attached to subgraph_request OpenTelemetry trace spans, specifying the URL to which the particular request was made.
Errors returned to clients are now more consistently formed (Issue #2101)
We now return errors in a more consistent shape to those which were returned by Apollo Gateway and Apollo Server, and seen in the documentation. In particular, when available, a stable code field will be included in the error's extensions.
🧪 Experimental
Note
These features are subject to change slightly (usually, in terms of naming or interfaces) before graduating to general availability.
[Read mo...
v1.6.0
🚀 Features
Add support for experimental tooling (Issue #2136)
Displays a message at startup listing the experimental_ configurations in use, with related GitHub discussions.
It also adds a new CLI command, router config experimental, to display all available experimental configurations.
Re-deploy router pods if the SuperGraph configmap changes (PR #2223)
When setting the supergraph with the supergraphFile variable a sha256 checksum is calculated and set as an annotation for the router pods. This will spin up new pods when the supergraph is mounted via config map and the schema has changed.
Note: It is preferable to not have --hot-reload enabled with this feature since re-configuring the router during a pod restart is duplicating the work and may cause confusion in log messaging.
By @toneill818 in #2223
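The checksum-annotation technique can be sketched as below. This only illustrates the idea: the annotation key `checksum/supergraph` and the helper names are assumptions, and the real chart computes the value in its Helm templates rather than in Python.

```python
import hashlib


def supergraph_checksum(schema_text):
    """Hex-encoded SHA-256 digest of the supergraph schema contents."""
    return hashlib.sha256(schema_text.encode("utf-8")).hexdigest()


def pod_annotations(schema_text):
    # The annotation key is hypothetical; any key works as long as its value
    # changes whenever the schema changes.
    return {"checksum/supergraph": supergraph_checksum(schema_text)}


old = pod_annotations("type Query { me: String }")
new = pod_annotations("type Query { me: String, you: String }")

# A changed annotation alters the pod template, which triggers a new rollout.
needs_rollout = old != new
print(needs_rollout)
```

Because the digest depends only on the schema contents, redeploying an unchanged schema leaves the annotation, and therefore the pods, untouched.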
Tracing batch span processor is now configurable (Issue #2232)
Exporting traces often requires performance tuning based on the throughput of the router, sampling settings and ingestion capability of tracing ingress.
All exporters now support configuring the batch span processor in the router yaml.
telemetry:
  apollo:
    batch_processor:
      scheduled_delay: 100ms
      max_concurrent_exports: 1000
      max_export_batch_size: 10000
      max_export_timeout: 100s
      max_queue_size: 10000
  tracing:
    jaeger|zipkin|otlp|datadog:
      batch_processor:
        scheduled_delay: 100ms
        max_concurrent_exports: 1000
        max_export_batch_size: 10000
        max_export_timeout: 100s
        max_queue_size: 10000
See the Open Telemetry docs for more information.
By @BrynCooke in #1970
Add hot-reload support for Rhai scripts (Issue #1071)
The router will "watch" your "rhai.scripts" directory for changes and prompt an interpreter re-load if changes are detected. Changes are defined as:
- creating a new file with a ".rhai" suffix
- modifying or removing an existing file with a ".rhai" suffix
The watch is recursive, so files in sub-directories of the "rhai.scripts" directory are also watched.
The Router attempts to identify errors in scripts before applying the changes. If errors are detected, these will be logged and the changes will not be applied to the runtime. Not all classes of error can be reliably detected, so check the log output of your router to make sure that changes have been applied.
Add support for working with multi-value header keys to Rhai (Issue #2211, Issue #2255)
Adds support for setting a header map key with an array. This causes the HeaderMap key/values to be appended() to the map, rather than inserted().
Adds support for a new values() fn which retrieves multiple values for a HeaderMap key as an array.
Example use from Rhai as:
response.headers["set-cookie"] = [
    "foo=bar; Domain=localhost; Path=/; Expires=Wed, 04 Jan 2023 17:25:27 GMT; HttpOnly; Secure; SameSite=None",
    "foo2=bar2; Domain=localhost; Path=/; Expires=Wed, 04 Jan 2023 17:25:27 GMT; HttpOnly; Secure; SameSite=None",
];
response.headers.values("set-cookie"); // Returns the array of values
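The insert-versus-append distinction can be modelled with a small multi-value map. This is a sketch in Python rather than Rhai, and the `HeaderMap` class here is a toy written for illustration, not the type used by the Router; returning the first value on plain indexing is an assumption of the sketch.

```python
class HeaderMap:
    """Tiny multi-value header map (illustrative only)."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, value):
        # Replaces whatever was stored under this name.
        self._entries[name.lower()] = [value]

    def append(self, name, value):
        # Keeps existing values and adds one more.
        self._entries.setdefault(name.lower(), []).append(value)

    def __setitem__(self, name, value):
        # Assigning a list appends each element; assigning a scalar inserts.
        if isinstance(value, list):
            for item in value:
                self.append(name, item)
        else:
            self.insert(name, value)

    def __getitem__(self, name):
        # Assumption of this sketch: plain indexing returns the first value.
        return self._entries[name.lower()][0]

    def values(self, name):
        return list(self._entries.get(name.lower(), []))


headers = HeaderMap()
headers["set-cookie"] = ["foo=bar", "foo2=bar2"]  # both cookies are kept
headers["x-demo"] = "first"
headers["x-demo"] = "second"                      # scalar assignment overwrites
print(headers.values("set-cookie"), headers.values("x-demo"))
```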
🐛 Fixes
Filter nullified deferred responses (Issue #2213)
Updates to the @defer specification mandate that a deferred response should not be sent if its path points to an element of the response that was nullified in a previous payload.
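A minimal sketch of such a filter is shown below. The data shapes and the `is_nullified` helper are hypothetical, chosen only to illustrate the rule; in this sketch a missing key or index is treated the same as a null.

```python
def is_nullified(data, path):
    """True if walking `path` through `data` reaches a null (or nothing at all)."""
    current = data
    for segment in path:
        if current is None:
            return True
        try:
            current = current[segment]
        except (KeyError, IndexError, TypeError):
            return True  # sketch assumption: missing is treated like null
    return current is None


# Primary payload in which `user` and the second product were nullified.
primary = {"data": {"user": None, "products": [{"name": "a"}, None]}}

deferred = [
    {"path": ["user", "profile"], "data": {"bio": "..."}},
    {"path": ["products", 0], "data": {"price": 10}},
    {"path": ["products", 1], "data": {"price": 20}},
]

# Only deferred payloads whose path still points at live data are sent.
kept = [d for d in deferred if not is_nullified(primary["data"], d["path"])]
print([d["path"] for d in kept])
```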
Return root __typename when it is part of a query with a deferred fragment (Issue #1677)
With this query:
{
  __typename
  fast
  ...deferedFragment @defer
}
fragment deferedFragment on Query {
  slow
}
You will receive the first response chunk:
{"data":{"__typename": "Query", "fast":0},"hasNext":true}
Wait for opentelemetry tracer provider to shutdown (PR #2191)
When we drop Telemetry, we spawn a thread to perform the global opentelemetry trace provider shutdown. The documentation of this function indicates that "This will invoke the shutdown method on all span processors. span processors should export remaining spans before return". We now give that process some time to complete (5 seconds currently) before returning from the drop. This will provide more opportunity for spans to be exported.
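The "spawn a thread, then wait a bounded time" pattern looks roughly like this. The sketch is in Python rather than the Router's Rust, and `shutdown_with_grace` is a hypothetical helper used only to show the shape of the technique.

```python
import threading
import time


def shutdown_with_grace(shutdown, grace_seconds=5.0):
    """Run `shutdown` on a helper thread and wait up to `grace_seconds` for it."""
    worker = threading.Thread(target=shutdown, daemon=True)
    worker.start()
    worker.join(timeout=grace_seconds)
    return not worker.is_alive()  # True if shutdown finished within the grace period


exported = []

# A quick shutdown completes well inside the grace period.
finished = shutdown_with_grace(lambda: exported.append("remaining spans"))

# A slow shutdown is abandoned once the (shortened) grace period runs out.
timed_out = not shutdown_with_grace(lambda: time.sleep(0.5), grace_seconds=0.05)

print(finished, timed_out)
```

Bounding the wait keeps teardown from hanging indefinitely while still giving pending work a chance to finish.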
Dispatch errors from the primary response to deferred responses (Issue #1818, Issue #2185)
When errors are generated during the primary execution, some may also be assigned to deferred responses.
Reconstruct deferred queries with knowledge about fragments (Issue #2105)
When we are using @defer, response formatting must apply on a subset of the query (primary or deferred), that is reconstructed from information provided by the query planner: a path into the response and a subselection. Previously, that path did not include information on fragment application, which resulted in query reconstruction issues if @defer was used under a fragment application on an interface.
🛠 Maintenance
Improve plugin registration predictability (PR #2181)
This replaces ctor with linkme. ctor enables rust code to execute before main. This can be a source of undefined behaviour and we don't need our code to execute before main. linkme provides a registration mechanism that is perfect for this use case, so switching to use it makes the router more predictable, simpler to reason about and with a sound basis for future plugin enhancements.
it_rate_limit_subgraph_requests fixed (Issue #2213)
This test was failing frequently because it is a timing test that was being run in a single-threaded tokio runtime.
By @BrynCooke in #2218
Update reports.proto protobuf definition (PR #2247)
Update the reports.proto file, and change the prompt to update the file with the correct new location.
By @o0Ignition0o in #2247
Upgrade OpenTelemetry to 0.18 (Issue #1970)
Update to OpenTelemetry 0.18.
By @bryncooke and @bnjjj in #1970 and #2236
Remove spaceport (Issue #2233)
Removal significantly simplifies telemetry code and is likely to increase performance and reliability.
By @bryncooke in #1970
Update to Rust 1.65 (Issue #2220)
Rust MSRV incremented to 1.65.
By @bryncooke in #2221 and #2240
Improve automated release (Pull #2220)
Improved the automated release to:
- Update the scaffold files
- Improve the names of prepare release steps in circle.
By @bryncooke in #2256
Use Elastic-2.0 license spdx (PR #2055)
Now that the Elastic-2.0 spdx is a valid identifier in the rust ecosystem, we can update the router references.
By @o0Ignition0o in #2054
Configuration
Protoc now required to build from source (Issue #1970)
Protoc is now required to build Apollo Router. Upgrading to Open Telemetry 0.18 has enabled us to upgrade tonic which in turn no longer bundles protoc.
Users must install it themselves; see https://grpc.io/docs/protoc-installation/.
By @bryncooke in #1970
Jaeger scheduled_delay moved to batch_processor->scheduled_delay ([Issue #2232](https://github.com/apollographql/router/issu...
v1.5.0
🚀 Features
Add configuration for trace ID (Issue #2080)
Trace ids can be propagated directly from a request header:
telemetry:
  tracing:
    propagation:
      # If you have your own way to generate a trace id and you want to pass it via a custom request header
      request:
        header_name: my-trace-id
In addition, trace id can be exposed via a response header:
telemetry:
  tracing:
    experimental_response_trace_id:
      enabled: true # default: false
      header_name: "my-trace-id" # default: "apollo-trace-id"
Using this configuration you will have a response header called my-trace-id containing the trace ID. This can help you debug a specific query, since you can grep your logs for this trace ID to get more context.
Add configuration for logging and add more logs (Issue #1998)
By default, logs do not contain request body, response body or headers.
It is now possible to conditionally add this information for debugging and audit purposes.
Here is an example of how you can configure it:
telemetry:
  experimental_logging:
    format: json # By default it's "pretty" if you are in an interactive shell session
    display_filename: true # Display filename where the log is coming from. Default: true
    display_line_number: false # Display line number in the file where the log is coming from. Default: true
    # If one of these headers matches we will log supergraph and subgraphs requests/responses
    when_header:
      - name: apollo-router-log-request
        value: my_client
        headers: true # default: false
        body: true # default: false
      # log request for all requests/responses headers coming from iPhones
      - name: user-agent
        match: ^Mozilla/5\.0 \(iPhone.*
        headers: true
Provide multi-arch (amd64/arm64) Docker images for the Router (Issue #1932)
From 1.5.0 our Docker images will be multi-arch.
Add a supergraph configmap option to the helm chart (PR #2119)
Adds the capability to create a configmap containing your supergraph schema. Here's an example of how you could make use of this from your values.yaml and with the helm install command.
extraEnvVars:
  - name: APOLLO_ROUTER_SUPERGRAPH_PATH
    value: /data/supergraph-schema.graphql
extraVolumeMounts:
  - name: supergraph-schema
    mountPath: /data
    readOnly: true
extraVolumes:
  - name: supergraph-schema
    configMap:
      name: "{{ .Release.Name }}-supergraph"
      items:
        - key: supergraph-schema.graphql
          path: supergraph-schema.graphql
With that values.yaml content, and with your supergraph schema in a file named supergraph-schema.graphql, you can execute:
helm upgrade --install --create-namespace --namespace router-test --set-file supergraphFile=supergraph-schema.graphql router-test oci://ghcr.io/apollographql/helm-charts/router --version 1.0.0-rc.9 --values values.yaml
Configuration upgrades (Issue #2123)
Occasionally we will make changes to the Router yaml configuration format.
When starting the Router, if the configuration can be upgraded, it will do so automatically and display a warning:
2022-11-22T14:01:46.884897Z WARN router configuration contains deprecated options:
1. telemetry.tracing.trace_config.attributes.router has been renamed to 'supergraph' for consistency
These will become errors in the future. Run `router config upgrade <path_to_router.yaml>` to see a suggested upgraded configuration.
Note: If a configuration has errors after upgrading then the configuration will not be upgraded automatically.
From the CLI users can run:
- router config upgrade <path_to_router.yaml> to output configuration that has been upgraded to match the latest config format.
- router config upgrade --diff <path_to_router.yaml> to output a diff, e.g.
 telemetry:
   apollo:
     client_name_header: apollographql-client-name
   metrics:
     common:
       attributes:
-        router:
+        supergraph:
           request:
             header:
             - named: "1" # foo
There are situations where comments and whitespace are not preserved.
By @bryncooke in #2116, #2162
Experimental 🥼 subgraph request retry (Issue #338, Issue #1956)
Implements subgraph request retries, using Finagle's retry buckets algorithm:
- it defines a minimal number of retries per second (min_per_sec, default is 10 retries per second), to bootstrap the system or for low traffic deployments
- for each successful request, we add a "token" to the bucket, those tokens expire after ttl (default: 10 seconds)
- the number of available additional retries is a part of the number of tokens, defined by retry_percent (default is 0.2)
Request retries are disabled by default on mutations.
This is activated in the traffic_shaping plugin, either globally or per subgraph:
traffic_shaping:
  all:
    experimental_retry:
      min_per_sec: 10
      ttl: 10s
      retry_percent: 0.2
      retry_mutations: false
  subgraphs:
    accounts:
      experimental_retry:
        min_per_sec: 20
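The retry budget described above can be sketched as a sliding window. This is a simplified illustration, not the Router's or Finagle's implementation: the `RetryBudget` class and its method names are hypothetical, and the allowance formula (`min_per_sec * ttl` plus `retry_percent` of the unexpired successes) is this sketch's reading of the three rules listed above.

```python
import time
from collections import deque


class RetryBudget:
    """Simplified retry budget: successes deposit tokens, retries spend them."""

    def __init__(self, min_per_sec=10, ttl=10.0, retry_percent=0.2, clock=time.monotonic):
        self.min_per_sec = min_per_sec
        self.ttl = ttl
        self.retry_percent = retry_percent
        self.clock = clock
        self.deposits = deque()     # timestamps of successful requests
        self.withdrawals = deque()  # timestamps of retries already spent

    def _expire(self):
        # Drop tokens and spent retries older than `ttl` seconds.
        cutoff = self.clock() - self.ttl
        for queue in (self.deposits, self.withdrawals):
            while queue and queue[0] <= cutoff:
                queue.popleft()

    def record_success(self):
        self.deposits.append(self.clock())

    def try_retry(self):
        """Return True and spend one retry if the budget allows it."""
        self._expire()
        allowance = self.min_per_sec * self.ttl + self.retry_percent * len(self.deposits)
        if len(self.withdrawals) < allowance:
            self.withdrawals.append(self.clock())
            return True
        return False


now = [0.0]  # fake clock so the demo is deterministic
budget = RetryBudget(min_per_sec=0, ttl=10.0, retry_percent=0.2, clock=lambda: now[0])
for _ in range(10):
    budget.record_success()

granted = [budget.try_retry() for _ in range(3)]  # 10 successes * 0.2 = 2 retries
now[0] = 11.0                                     # every token has expired by now
after_expiry = budget.try_retry()
print(granted, after_expiry)
```

Tying retries to recent successes means a struggling subgraph receives fewer retries, not more, which limits retry storms.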
Experimental 🥼 Caching configuration (Issue #2075)
Split Redis cache configuration for APQ and query planning:
supergraph:
  apq:
    experimental_cache:
      in_memory:
        limit: 512
      redis:
        urls: ["redis://..."]
  query_planning:
    experimental_cache:
      in_memory:
        limit: 512
      redis:
        urls: ["redis://..."]
@defer Apollo tracing support (Issue #1600)
Added Apollo tracing support for queries that use @defer. You can now view traces in Apollo Studio as normal.
By @bryncooke in #2190
🐛 Fixes
Router debug Docker images now run under the control of heaptrack (Issue #2135)
From 1.5.0, our debug Docker image will invoke the router under the control of heaptrack. We are making this change to make it simple for users to investigate potential memory issues with the Router.
Do not run debug images in performance sensitive contexts. The tracking of memory allocations will significantly impact performance. In general, the debug image should only be used in consultation with Apollo engineering and support.
Look at our documentation for examples of how to use the image in either Docker or Kubernetes.
Fix panic when dev mode enabled with empty config file (Issue #2182)
If you're running the Router in dev mode with an empty config file, it will no longer panic.
Fix missing apollo tracing variables (Issue #2186)
The send_variable_values setting had no effect. This is now fixed.
telemetry:
  apollo:
    send_variable_values: all
By @bryncooke in #2190
Fix build_docker_image.sh script when using default repo (PR #2163)
Adding the -r flag recently broke the existing functionality to build from the default repo using -b. This fixes that.
Improve errors when subgraph returns non-GraphQL response with a non-2xx status code (Issue #2117)
The error response will now contain the status code and status name. Example: HTTP fetch failed from 'my-service': 401 Unauthorized
Handle mutations containing @defer (Issue #2099)
The Router generates partial query shapes corresponding to the primary and deferred responses, to validate the data sent back to the client. Those query shapes were invalid for mutations.
Experimental 🥼 APQ and query planner Redis caching fixes (PR #2176)
- use a null byte as separator in Redis keys
- handle Redis c...
v1.4.0
🚀 Features
Add support for returning different HTTP status codes in Rhai (Issue #2023)
It is now possible to return different HTTP status codes when raising an exception in Rhai. You do this by providing an object map with two keys: status and message, rather than merely a string as was the case previously.
throw #{
    status: 403,
    message: "I have raised a 403"
};
This example will short-circuit request/response processing and return with an HTTP status code of 403 to the client and also set the error message accordingly.
It is still possible to return errors using the current pattern, which will continue to return HTTP status code 500 as previously:
throw "I have raised an error";
It is not currently possible to return a 200 status code using this pattern. If you try, it will be implicitly converted into a 500 error.
Add support for urlencode() / urldecode() in Rhai (Issue #2052)
Two new functions, urlencode() and urldecode(), may now be used to URL-encode or URL-decode strings, respectively.
Experimental 🥼 External cache storage in Redis (PR #2024)
We are experimenting with introducing external storage for caches in the Router, which will provide a foundation for caching things like automated persisted queries (APQ) amongst other future-looking ideas. Our initial implementation supports a multi-level cache hierarchy, first attempting an in-memory LRU-cache, followed by a Redis Cluster backend.
As this is still experimental, it is only available as an opt-in through a Cargo feature-flag.
By @garypen and @Geal in #2024
Expose query_plan to ExecutionRequest in Rhai (PR #2081)
You can now read the query-plan from an execution request by accessing request.query_plan. Additionally, request.context also now supports the Rhai in keyword.
🐛 Fixes
Move error messages about nullifying into extensions (Issue #2071)
The Router was previously creating and returning error messages in errors when nullability rules had been triggered (e.g., when a non-nullable field was null, it nullifies the parent object). These are now emitted into a valueCompletion portion of the extensions response.
Adding those messages in the list of errors
was potentially redundant and resulted in failures by clients (such as the Apollo Client error policy, by default) which would otherwise have expected nullified fields as part of normal operation execution. Additionally, the subgraph could already add such an error message indicating why a field was null which would cause the error to be doubled.
Fix Float input-type coercion for default values with values larger than 32-bit (Issue #2087)
A regression has been fixed which caused the Router to reject integers larger than 32-bits used as the default values on Float fields in input types.
In other words, the following will once again work as expected:
input MyInputType {
  a_float_input: Float = 9876543210
}
By @o0Ignition0o in #2090
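The rule at play can be sketched as follows: GraphQL's Int type is limited to the signed 32-bit range, while Float input coercion also accepts integer values of any magnitude. The two helper functions are hypothetical illustrations of that rule and are not taken from the Router.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def coerce_int(value):
    """GraphQL Int: only integers within the signed 32-bit range."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("Int cannot represent a non-integer value")
    if not INT32_MIN <= value <= INT32_MAX:
        raise ValueError("Int cannot represent a value outside the 32-bit range")
    return value


def coerce_float(value):
    """GraphQL Float: accepts floats and integers, whatever their size."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("Float cannot represent a non-numeric value")
    return float(value)


default = coerce_float(9876543210)  # the default value from MyInputType
try:
    coerce_int(9876543210)
    int_accepted = True
except ValueError:
    int_accepted = False

print(default, int_accepted)
```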
Assume Accept: application/json
when no Accept
header is present (Issue #1990)
The Accept
header means */*
when it is absent, and despite efforts to fix this previously, we still were not always doing the correct thing.
@skip
and @include
implementation for root-level fragment use (Issue #2072)
The @skip
and @include
directives are now implemented for both inline fragments and fragment spreads at the top-level of operations.
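For reference, the GraphQL specification says a selection carrying both directives is kept only when the @skip condition is false and the @include condition is true. A minimal Python sketch of that rule follows; the `is_selected` helper and its argument shapes are hypothetical, not Router internals.

```python
from typing import Mapping

def is_selected(directives: Mapping[str, str], variables: Mapping[str, bool]) -> bool:
    """`directives` maps "skip"/"include" to the variable name used in its `if:` argument."""
    skip = variables[directives["skip"]] if "skip" in directives else False
    include = variables[directives["include"]] if "include" in directives else True
    # Keep the fragment only if it is not skipped and it is included.
    return (not skip) and include

variables = {"hideDetails": True, "showExtras": False}
print(is_selected({"skip": "hideDetails"}, variables))      # fragment is dropped
print(is_selected({"include": "showExtras"}, variables))    # fragment is dropped
print(is_selected({}, variables))                           # no directives: kept
```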
🛠 Maintenance
Use debian:bullseye-slim
as our base Docker image (PR #2085)
A while ago, when we added compression support to the router, we discovered that the Distroless base-images we were using didn't ship with a copy of libz.so.1
. We addressed that problem by copying in a version of the library from the Distroless image (Java) which does ship it. While that worked, we found challenges in adding support for both aarch64
and amd64
Docker images that would make it less than ideal to continue using those Distroless images.
Rather than persist with this complexity, we've concluded that it would be better to just use a base image which ships with libz.so.1
, hence the change to debian:bullseye-slim
. Those images are still quite minimal and the resulting images are similar in size.
Update apollo-parser
to v0.3.2
(PR #2103)
This updates our dependency on our apollo-parser
package which brings a few improvements, including more defensive parsing of some operations. See its CHANGELOG in the apollo-rs
repository for more details.
📚 Documentation
Fix example helm show values
command (PR #2088)
The helm show values
command needs to use the correct Helm chart reference oci://ghcr.io/apollographql/helm-charts/router
.
v1.3.0
🚀 Features
Add support for DHAT-based heap profiling (PR #1829)
The dhat-rs crate provides DHAT-style heap profiling. We have added two compile-time features, dhat-heap
and dhat-ad-hoc
, which leverage this ability.
Add trace_id
in logs to correlate entries from the same request (Issue #1981)
A trace_id
is now added to each log line to help correlate log entries to specific requests. The value for this property will be automatically inherited from any enabled distributed tracing headers, such as those listed in our Tracing propagation header documentation (e.g., Jaeger, Zipkin, Datadog, etc.).
In the event that a trace_id
was not inherited from a propagated header, the Router will originate a trace_id
and also propagate that value to subgraphs to enable tracing in subgraphs.
Here is an example of the trace_id
appearing in plain-text log output:
2022-10-21T15:17:45.562553Z ERROR [trace_id=5e6a6bda8d0dca26e5aec14dafa6d96f] apollo_router::services::subgraph_service: fetch_error="hyper::Error(Connect, ConnectError(\"tcp connect error\", Os { code: 111, kind: ConnectionRefused, message: \"Connection refused\" }))"
2022-10-21T15:17:45.565768Z ERROR [trace_id=5e6a6bda8d0dca26e5aec14dafa6d96f] apollo_router::query_planner::execution: Fetch error: HTTP fetch failed from 'accounts': HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)
And an example of the trace_id
appearing in JSON-formatted log output in a similar scenario:
{"timestamp":"2022-10-26T15:39:01.078260Z","level":"ERROR","fetch_error":"hyper::Error(Connect, ConnectError(\"tcp connect error\", Os { code: 111, kind: ConnectionRefused, message: \"Connection refused\" }))","target":"apollo_router::services::subgraph_service","filename":"apollo-router/src/services/subgraph_service.rs","line_number":182,"span":{"name":"subgraph"},"spans":[{"trace_id":"5e6a6bda8d0dca26e5aec14dafa6d96f","name":"request"},{"name":"supergraph"},{"name":"execution"},{"name":"parallel"},{"name":"fetch"},{"name":"subgraph"}]}
{"timestamp":"2022-10-26T15:39:01.080259Z","level":"ERROR","message":"Fetch error: HTTP fetch failed from 'accounts': HTTP fetch failed from 'accounts': error trying to connect: tcp connect error: Connection refused (os error 111)","target":"apollo_router::query_planner::execution","filename":"apollo-router/src/query_planner/execution.rs","line_number":188,"span":{"name":"parallel"},"spans":[{"trace_id":"5e6a6bda8d0dca26e5aec14dafa6d96f","name":"request"},{"name":"supergraph"},{"name":"execution"},{"name":"parallel"}]}
Reload configuration when receiving the SIGHUP signal (Issue #35)
The Router will now reload its configuration when receiving the SIGHUP signal. This signal is only supported on *nix platforms,
and only when a configuration file was passed to the Router initially at startup.
🐛 Fixes
Fix the deduplication logic in deduplication caching (Issue #1984)
Under load, we found it was possible to break the router de-duplication logic and leave orphaned entries in the waiter map. This fixes the de-duplication logic to prevent this from occurring.
Follow back-off instructions from Studio Uplink (Issue #1494, Issue #1539)
When operating in a Managed Federation configuration and fetching the supergraph from Apollo Uplink, the Router will now react differently depending on the response from Apollo Uplink, rather than retrying incessantly:
- It will not attempt to retry when met with unrecoverable conditions (e.g., a Graph that does not exist).
- It will back off on retries when the infrastructure asks for a longer retry interval.
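The two behaviors above amount to a small decision made after each poll. Here is a hedged Python sketch of that decision; the `UplinkResponse` shape and its field names are invented for illustration and do not reflect the actual Uplink protocol or the Router's code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UplinkResponse:
    # Hypothetical, simplified view of a poll result.
    ok: bool
    retry_later: bool = False                  # the failure is recoverable
    min_delay_seconds: Optional[float] = None  # interval requested by the server

def next_poll_delay(response: UplinkResponse, default_delay: float) -> Optional[float]:
    """Return how long to wait before polling again, or None to stop retrying."""
    if not response.ok and not response.retry_later:
        return None  # unrecoverable, e.g. the graph does not exist
    if response.min_delay_seconds is not None:
        # Never poll sooner than the infrastructure asked for.
        return max(default_delay, response.min_delay_seconds)
    return default_delay

print(next_poll_delay(UplinkResponse(ok=False, retry_later=True, min_delay_seconds=60.0), 10.0))
```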
Fix the rhai SDL print
function (Issue #2005)
Fixes the print
function exposed to rhai which was broken due to a recent change that was made in the way we pass SDL (schema definition language) to plugins.
By @fernando-apollo in #2007
Export router_factory::Endpoint
(PR #2007)
We now export the router_factory::Endpoint
struct that was inadvertently unexposed. Without access to this struct, it was not possible to implement the web_endpoints
trait in plugins.
By @scottdouglas1989 in #2007
Validate default values for input object fields (Issue #1979)
When validating variables, the Router now uses graph-specified default values for object fields, if applicable.
Address regression when sending gRPC to localhost
(Issue #2036)
We again support sending unencrypted gRPC tracing and metrics data to localhost
. This follows up on a regression introduced in the previous release, which had addressed a limitation that prevented sending gRPC to TLS-secured endpoints.
Applying a proper fix was complicated by an upstream issue (opentelemetry-rust#908) which incorrectly assumes https
in the absence of a more-specific protocol/scheme, contrary to the OpenTelemetry specification, which indicates otherwise.
The Router will now detect and work-around this upstream issue by explicitly setting the full, correct endpoint URLs when not specified in config.
In addition:
- Basic TLS-encryption will be enabled when the endpoint scheme is explicitly
https
.
- A warning will be emitted if the endpoint port is 443 but no TLS config is specified since most traffic on port 443 is expected to be encrypted.
By @BrynCooke in #2048
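The endpoint handling described above can be approximated as follows. This is a Python sketch under assumptions, not the Router's implementation: defaulting a scheme-less endpoint to http:// is our reading of "explicitly setting the full, correct endpoint URLs", and `normalize_endpoint` is a hypothetical name.

```python
from typing import List, Tuple
from urllib.parse import urlsplit

def normalize_endpoint(endpoint: str, tls_configured: bool = False) -> Tuple[str, bool, List[str]]:
    """Return (full_url, use_tls, warnings) for an exporter endpoint."""
    warnings: List[str] = []
    if "://" not in endpoint:
        # Assumption: no scheme means plaintext, so do not let a library guess https.
        endpoint = "http://" + endpoint
    parts = urlsplit(endpoint)
    # Basic TLS is enabled when the scheme is explicitly https.
    use_tls = tls_configured or parts.scheme == "https"
    if parts.port == 443 and not use_tls:
        warnings.append("endpoint uses port 443 but no TLS config was specified")
    return endpoint, use_tls, warnings

print(normalize_endpoint("localhost:4317"))
print(normalize_endpoint("collector.example.com:443"))
```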
🛠 Maintenance
Apply Tower best-practice to "inner" Service cloning (PR #2030)
We found our Service
readiness checks could be improved by following the Tower project's recommendations for cloning inner Services.
Split the configuration file implementation into modules (Issue #1790)
The internals of the implementation for the configuration have been modularized to facilitate on-going development. There should be no impact to end-users who are only using YAML to configure their Router.
Apply traffic-shaping directly to supergraph
and subgraph
(PR #2034)
The plugin infrastructure works on BoxService
instances and makes no guarantee on plugin ordering. The traffic shaping plugin needs a clonable inner service, and should run right before calling the underlying service. We've changed the traffic plugin application so it can work directly on the underlying service. The configuration remains the same since this is still implemented as a plugin.
📚 Documentation
Remove references to Git submodules from DEVELOPMENT.md
(Issue #2012)
We've removed the instructions from our development documentation which guide users to familiarize themselves with and clone Git submodules when working on the Router source itself. This follows up on the removal of the modules themselves in PR #1856.
v1.2.1
🐛 Fixes
Update to Federation v2.1.4 (PR #1994)
In addition to general Federation bug-fixes, this update should resolve a case (seen in Issue #1962) where @defer
directives which had been previously present in a Supergraph were causing a startup failure in the Router when we were trying to generate an API schema in the Router with @defer
.
Assume Accept: application/json
when no Accept
header is present (Issue #1995)
The Accept
header means */*
when it is absent.
Fix OpenTelemetry OTLP gRPC (Issue #1976)
OpenTelemetry (OTLP) gRPC failures involving TLS errors have been resolved against external APMs: including Datadog, NewRelic and Honeycomb.io.
By @BrynCooke in #1977
Prefix the Prometheus metrics with apollo_router_
(Issue #1915)
Correctly prefix Prometheus metrics with the apollo_router
prefix, per convention.
- http_requests_error_total{message="cannot contact the subgraph",service_name="apollo-router",subgraph="my_subgraph_name_error",subgraph_error_extended_type="SubrequestHttpError"} 1
+ apollo_router_http_requests_error_total{message="cannot contact the subgraph",service_name="apollo-router",subgraph="my_subgraph_name_error",subgraph_error_extended_type="SubrequestHttpError"} 1
Fix --hot-reload
in Kubernetes and Docker (Issue #1476)
The --hot-reload
flag now chooses a file event notification mechanism at runtime. The exact mechanism is determined by the notify
crate.
Fix a coercion rule that failed to validate 64-bit integers (PR #1951)
Queries that passed 64-bit integers for Float
input variables were failing to validate despite being valid.
By @o0Ignition0o in #1951
Prometheus: make sure apollo_router_http_requests_error_total
and apollo_router_http_requests_total
are incremented. (PR #1953)
This affected two different metrics differently:
- The
apollo_router_http_requests_error_total
metric only incremented for requests that would be an INTERNAL_SERVER_ERROR
in the Router (the service stack returning a BoxError
). This meant that GraphQL validation errors were not incrementing this counter.
- The
apollo_router_http_requests_total
metric would only increment for successful requests despite the fact that the Prometheus documentation suggests this should be incremented regardless of whether the request succeeded or not.
This PR makes sure we always increment apollo_router_http_requests_total
and we increment apollo_router_http_requests_error_total
when the status code is 4xx or 5xx.
By @o0Ignition0o in #1953
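The counting rule after this fix is simple enough to state in a few lines. The Python sketch below only illustrates the rule (total always, error on 4xx or 5xx); it uses a plain Counter rather than a real Prometheus client, and `record_response` is a made-up helper.

```python
from collections import Counter

def record_response(metrics: Counter, status_code: int) -> None:
    # Every request counts toward the total, successful or not.
    metrics["apollo_router_http_requests_total"] += 1
    # Client and server errors also count toward the error total.
    if 400 <= status_code <= 599:
        metrics["apollo_router_http_requests_error_total"] += 1

metrics = Counter()
for status in (200, 400, 500):
    record_response(metrics, status)
print(dict(metrics))
```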
Set no_delay
and keepalive
on subgraph requests (Issue #1905)
This re-introduces these parameters which were incorrectly removed in a previous pull request.
🛠 Maintenance
Improve the stability of some flaky tests (PR #1972)
The trace and rate limiting tests have been sporadically failing in our CI environment. The root cause was a race-condition in the tests so the tests have been made more resilient to reduce the number of failures.
By @garypen in #1972 and #1974
Update docker-compose
and Dockerfile
s now that the submodules have been removed (PR #1950)
We recently removed Git submodules from this repository but we didn't update various docker-compose.yml
files.
This PR adds new Dockerfile
s and updates existing docker-compose.yml
files so we can run integration tests (and the fuzzer) without needing to git clone
and set up the Federation and federation-demo
repositories.
By @o0Ignition0o in #1950
Fix logic around Accept
headers and multipart responses (PR #1923)
If the Accept
header contained multipart/mixed
, even with other alternatives like application/json
,
a query with a single response was still sent as multipart, which made Apollo Studio Explorer fail on the initial introspection query.
This changes the logic so that:
- If the client has indicated an
accept
of application/json
or */*
and there is a single response, it will be delivered as content-type: application/json
.
- If there are multiple responses or the client only accepts
multipart/mixed
, we will send a content-type: multipart/mixed
response. This will occur even if there is only one response.
- Otherwise, we will return an HTTP status code of
406 Not Acceptable
.
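The three rules above can be read as a small decision function. This Python sketch follows the rules literally and treats a missing Accept header as */*; it is an illustration rather than the Router's code, it ignores q-values, and `choose_content_type` is a hypothetical name.

```python
from typing import Optional

def choose_content_type(accept: Optional[str], response_count: int) -> Optional[str]:
    """Return the response content type, or None to signal 406 Not Acceptable."""
    # An absent Accept header is treated as */*.
    accepted = [item.split(";")[0].strip().lower() for item in (accept or "*/*").split(",")]
    accepts_json = "application/json" in accepted or "*/*" in accepted
    accepts_multipart = "multipart/mixed" in accepted

    if accepts_json and response_count == 1:
        return "application/json"
    if response_count > 1 or accepts_multipart:
        return "multipart/mixed"
    return None

print(choose_content_type("multipart/mixed, application/json", 1))
print(choose_content_type("multipart/mixed", 1))
```

With this ordering, a client such as Apollo Studio Explorer that lists both types gets plain JSON for a single-response introspection query.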
@defer
: duplicated errors across incremental items (Issue #1834, Issue #1818)
If a deferred response contains incremental responses, the errors should be dispatched in each increment according to the error's path.
Our Docker images are now linked to our GitHub repository per OCI-standards (PR #1958)
The org.opencontainers.image.source
annotation has been added to our Dockerfile
s and published Docker image in order to map the published image to our GitHub repository.
By @ndthanhdev in #1958
v1.2.0
❗ BREAKING ❗
Note the breaking change is not for the Router itself, but for the Router helm chart which is still 1.0.0-rc.5
Remove support for rhai.input_file
from the helm chart (Issue #1826)
The existing rhai.input_file
mechanism doesn't really work for most helm use cases. This PR removes this mechanism and encourages the use of the extraVolumes/extraVolumeMounts
mechanism with rhai.
Example: Create a configmap which contains your rhai scripts.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rhai-config
  labels:
    app.kubernetes.io/name: rhai-config
    app.kubernetes.io/instance: rhai-config
data:
  main.rhai: |
    // Call map_request with our service and pass in a string with the name
    // of the function to callback
    fn subgraph_service(service, subgraph) {
        print(`registering request callback for ${subgraph}`);
        const request_callback = Fn("process_request");
        service.map_request(request_callback);
    }

    // This will convert all cookie pairs into headers.
    // If you only wish to convert certain cookies, you
    // can add logic to modify the processing.
    fn process_request(request) {
        // Find our cookies
        if "cookie" in request.headers {
            print("adding cookies as headers");
            let cookies = request.headers["cookie"].split(';');
            for cookie in cookies {
                // Split our cookies into name and value
                let k_v = cookie.split('=', 2);
                if k_v.len() == 2 {
                    // trim off any whitespace
                    k_v[0].trim();
                    k_v[1].trim();
                    // update our headers
                    // Note: we must update subgraph.headers, since we are
                    // setting a header in our sub graph request
                    request.subgraph.headers[k_v[0]] = k_v[1];
                }
            }
        } else {
            print("no cookies in request");
        }
    }
  my-module.rhai: |
    fn process_request(request) {
        print("processing a request");
    }
Note how the data represents multiple rhai source files. The module code isn't used, it's just there to illustrate multiple files in a single configmap.
With that configmap in place, the helm chart can be used with a values file that contains:
router:
  configuration:
    rhai:
      scripts: /dist/rhai
      main: main.rhai
extraVolumeMounts:
  - name: rhai-volume
    mountPath: /dist/rhai
    readOnly: true
extraVolumes:
  - name: rhai-volume
    configMap:
      name: rhai-config
The configuration tells the router to load the rhai script main.rhai
from the directory /dist/rhai
(and load any imported modules from /dist/rhai).
This will mount the configmap created above in the /dist/rhai
directory with two files:
main.rhai
my-module.rhai
🚀 Features
Expose the TraceId functionality to rhai (Issue #1935)
A new function, traceid(), is exposed to rhai scripts which may be used to retrieve a unique trace id for a request. The trace id is an opentelemetry span id.
fn supergraph_service(service) {
    try {
        let id = traceid();
        print(`id: ${id}`);
    }
    catch(err)
    {
        // log any errors
        log_error(`span id error: ${err}`);
    }
}
🐛 Fixes
Fix studio reporting failures (Issue #1903)
The root cause of the issue was letting the server component of spaceport close silently during a re-configuration or schema reload. This fixes the issue by keeping the server component alive as long as the client remains connected.
Additionally, recycled spaceport connections are now re-connected to spaceport to further ensure connection validity.
Also make deadpool sizing constant across environments (#1893)
Update apollo-parser
to v0.2.12 (PR #1921)
Correctly lexes and creates an error token for unterminated GraphQL StringValue
s with unicode and line terminator characters.
traffic_shaping.all.deduplicate_query
was not correctly set (PR #1901)
Due to a change in our traffic_shaping configuration the deduplicate_query
field for all subgraphs wasn't set correctly.
🛠 Maintenance
Fix hpa yaml for appropriate kubernetes versions (#1908)
Correct schema for autoscaling/v2beta2 and autoscaling/v2 api versions of the
HorizontalPodAutoscaler within the helm chart
By @damienpontifex in #1914
v1.1.0
🚀 Features
Build, test and publish binaries for aarch64-unknown-linux-gnu
architecture (Issue #1192)
We're now testing and building aarch64-unknown-linux-gnu
binaries in our release pipeline and publishing those build artifacts as releases. These will be installable in the same way as our existing installation instructions.
By @EverlastingBugstopper in #1907
Add ability to specify repository location in "DIY" Docker builds (PR #1904)
The new -r
flag allows a developer to specify the location of a repository when building a diy docker image. Handy for developers with local repositories.
Support serviceMonitor
in Helm chart
kube-prometheus-stack
ignores scrape annotations, so a serviceMonitor
Custom Resource Definition (CRD) is required to scrape a given target to avoid scrape_configs
.
Add support for dynamic header injection (Issue #1755)
The following are now possible in our YAML configuration for headers
:
- Insert static header

  headers:
    all: # Header rules for all subgraphs
      request:
        - insert:
            name: "sent-from-our-apollo-router"
            value: "indeed"

- Insert header from context

  headers:
    all: # Header rules for all subgraphs
      request:
        - insert:
            name: "sent-from-our-apollo-router-context"
            from_context: "my_key_in_context"

- Insert header from request body

  headers:
    all: # Header rules for all subgraphs
      request:
        - insert:
            name: "sent-from-our-apollo-router-request-body"
            path: ".operationName" # It's a JSON path query to fetch the operation name from request body
            default: "UNKNOWN" # If no operationName has been specified
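To show how the three kinds of insert rule differ in where the value comes from, here is a loose Python sketch. It is not the Router's implementation: `resolve_insert` is hypothetical, and the path handling only supports a single top-level key such as ".operationName", which is a deliberate simplification of a JSON path query.

```python
from typing import Any, Mapping, Optional, Tuple

def resolve_insert(
    rule: Mapping[str, Any],
    context: Mapping[str, str],
    body: Mapping[str, Any],
) -> Tuple[str, Optional[str]]:
    """Return the (header name, header value) produced by one insert rule."""
    if "value" in rule:
        value = rule["value"]                      # static header
    elif "from_context" in rule:
        value = context.get(rule["from_context"])  # value stored in the request context
    else:
        key = rule["path"].lstrip(".")             # simplification: one top-level key only
        value = body.get(key) or rule.get("default")
    return rule["name"], value

rule = {"name": "sent-from-our-apollo-router-request-body", "path": ".operationName", "default": "UNKNOWN"}
print(resolve_insert(rule, {}, {"query": "{ me { id } }"}))
```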
🐛 Fixes
Fix external secret support in our Helm chart (Issue #1750)
If an external secret is specified, e.g.:
helm install --set router.managedFederation.existingSecret="my-secret-name" <etc...>
...then the router should be deployed and configured to use the existing secret.
Do not erase errors when missing _entities
(Issue #1863)
In a federated query, if the subgraph returned a response with errors
and a null
or absent data
field, the Router was ignoring the subgraph error and instead returning an error complaining about the missing _entities
field.
The Router will now aggregate the subgraph error and the missing _entities
error.
Fix Prometheus annotation and healthcheck default
The Prometheus annotation is breaking on a helm upgrade
so this fixes the template and also sets defaults. Additionally, defaults are now set for health-check
's listen
to be 0.0.0.0:8088
within the Helm chart.
Move response formatting to the execution service (PR #1771)
The response formatting process (in which response data is filtered according to deferred responses subselections and the API schema) was being executed in the supergraph
service. This was a bit late since it resulted in the execution
service returning a stream of invalid responses leading to the execution plugins operating on invalid data.
Hide footer from "homepage" landing page (PR #1900)
Hides some incorrect language about customization on the landing page. Customizing the landing page currently requires additional support.
🛠 Maintenance
Update to Federation 2.1.3 (Issue #1880)
This brings in Federation 2.1.3 to bring in updates to @apollo/federation
via the relevant bump in router-bridge
.
Update reqwest
dependency to resolve DNS resolution failures (Issue #1899)
This should resolve intermittent failures to resolve DNS in Uplink which were occurring due to an upstream bug in the reqwest
library.
Remove span details from log records (PR #1896)
Prior to this change, span details were written to log files. This was unwieldy and contributed to log bloat. Spans and logs are still linked in trace aggregators, such as Jaeger, and this change simply affects the content written to the console output.
Change span attribute names in OpenTelemetry to be more consistent (PR #1876)
The span attributes in our OpenTelemetry tracing spans are corrected to be consistently namespaced with attributes that are compliant with the OpenTelemetry specification.
Have CI use rust-toolchain.toml and not install another redundant toolchain (Issue #1313)
Avoids redundant work in CI and makes the YAML configuration less misleading.
Query plan execution refactoring (PR #1843)
This splits the query plan execution in multiple modules to make the code more manageable.
Remove Buffer
from APQ (PR #1641)
This removes tower::Buffer
usage from the Automated Persisted Queries (APQ) implementation to improve reliability.
Remove Buffer
from query deduplication (PR #1889)
This removes tower::Buffer
usage from the query deduplication implementation to improve reliability.
Set MSRV to 1.63.0 (PR #1886)
We compile and test with 1.63.0 on CI at the moment, so it is our de-facto Minimum Supported Rust Version (MSRV).
Setting rust-version
in Cargo.toml
provides a more helpful error message when using an older version rather than unexpected compilation errors.
By @SimonSapin in #1886
v1.0.0
Note
🤸 We've reached our initial v1.0.0 release. This project adheres to Semantic Versioning v2.0.0 and our version numbers follow the practices outlined in that specification. If you're updating from
1.0.0-rc.2
there is one breaking change to the API that is unlikely to affect you.
The migration steps from each pre-1.0 version will vary depending on which release you're coming from. To update from previous versions, you can consult the Release Notes for whichever version you are running and work your way to v1.0.0.
Our documentation has been updated to match our current v1.x state. In general, if you run the Router with your existing configuration, you should receive output indicating any values which are no longer valid and find their v1.0.0 equivalent in the updated documentation, or by searching the
CHANGELOG.md
for the prior configuration option to find when it changed.
Lastly, thank you for all of your positive and constructive feedback in our pre-1.0 stages. If you have any questions or feedback while updating to v1.0.0, please search for or open a GitHub Discussion or file a GitHub Issue if you find something working differently than it's documented.
We're excited about the path ahead! 👐
❗ BREAKING ❗
Removed Request::from_bytes()
from public API (Issue #1855)
We've removed Request::from_bytes()
from the public API. We were no longer using it internally and we hardly expect anyone external to have been relying on it so it was worth the remaining breaking change prior to v1.0.0.
We discovered this function during an exercise of documenting our entire public API. While we considered keeping it, it didn't necessarily meet our requirements for shipping it in the public API. Its internal usage was removed in d147f97d as part of PR #429.
We're happy to consider re-introducing this in the future (it even has a matching Response::from_bytes()
which it composes against nicely!), but we thought it was best to remove it for the time-being.
🚀 Features
Reintroduce health check (Issue #1861)
We have re-introduced a health check at the /health
endpoint on a dedicated port that is not exposed on the default GraphQL execution port (4000
) but instead on port 8088
. We recommend updating from the previous health-check suggestion by consulting our health check configuration documentation. This health check endpoint will act as an "overall" health check for the Router and we intend to add separate "liveliness" and "readiness" checks on their own dedicated endpoints (e.g., /health/live
and /health/ready
) in the future. At that time, this root /health
check will aggregate all other health checks to provide an overall health status. Today, however, it is simply a "liveliness" check and we have not defined "readiness". We also intend to use port 8088
for other ("internal") functionality in the future, keeping the GraphQL execution endpoint dedicated to serving external client requests.
As for some additional context as to why we've brought it back so quickly: We had previously removed the health check we had been offering in PR #1766 because we wanted to do some additional configuration design and lean into a new "admin port" (8088
). As a temporary solution, we offered the instruction to send a GET
query to the Router with a GraphQL query. After some new learnings and feedback, we've had to re-visit that conversation earlier than we expected!
Due to default CSRF protections enabled in the Router, GET
requests need to be accompanied by certain HTTP headers in order to disqualify them as being CORS-preflightable requests. While sending the additional header was reasonably straightforward in Kubernetes, other environments (including Google Kubernetes Engine's managed load balancers) didn't offer the ability to send those necessary HTTP headers along with their GET
queries. So, the /health
endpoint is back.
The health check endpoint is now exposed on 127.0.0.1:8088/health
by default, and its listen
socket address can be changed in the YAML configuration:
health-check:
  listen: 127.0.0.1:8088 # default
  enabled: true # default
The previous health-check suggestion (i.e., GET /?query={__typename}
) will still work, so long as your infrastructure supports sending custom HTTP headers with HTTP requests. Again though, we recommend updating to the new health check.
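A probe against the new endpoint needs no custom headers, which is the practical difference from the GET-query approach. The Python sketch below builds the URL from the listen address and performs a plain GET; treating an HTTP 200 as "healthy" is an assumption, and both helper names are hypothetical.

```python
from urllib.request import urlopen

def health_url(listen: str = "127.0.0.1:8088") -> str:
    # `listen` mirrors the health-check listen address from the YAML configuration.
    return f"http://{listen}/health"

def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Plain GET with no extra headers; any connection or HTTP error counts as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200  # assumption: 200 means the Router is up
    except OSError:
        # URLError and HTTPError are both subclasses of OSError.
        return False

print(health_url())
```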
By @o0Ignition0o and @BrynCooke in #1859
🐛 Fixes
Remove apollo_private
and OpenTelemetry entries from logs (Issue #1862)
This change removes some apollo_private
and OpenTelemetry (e.g., otel.kind
) fields from the logs.
By @garypen and @bnjjj in #1868
Update and validate Dockerfile
files (Issue #1854)
Several of the Dockerfile
s in the repository were out-of-date with respect to recent configuration changes. We've updated the configuration files and extended our tests to catch this automatically in the future.
🛠 Maintenance
Disable Deno snapshotting when building inside docs.rs
This works around V8 linking errors and caters to specific build-environment constraints and requirements that exist on the Rust documentation site docs.rs
.
By @SimonSapin in #1847
Add the Studio Uplink schema to the repository, with a test checking that it is up to date.
Previously we were downloading the Apollo Studio Uplink schema (which is used for fetching Managed Federation schema updates) at compile-time, which would fail in build environments without Internet access, like docs.rs
' build system.
If an update is needed, the test failure will print a message with the command to run.
By @SimonSapin in #1847