Releases: apollographql/router
v1.51.0-rc.0
v2.0.0-alpha.0
v1.50.0
🚀 Features
Support local persisted query manifests for use with offline licenses (Issue #4587)
Adds experimental support for passing persisted query manifests to use instead of the hosted Uplink version.
For example:
persisted_queries:
  enabled: true
  log_unknown: true
  experimental_local_manifests:
    - ./persisted-query-manifest.json
  safelist:
    enabled: true
    require_id: false
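To make the role of a local manifest concrete, here is a minimal Python sketch of an ID-to-operation lookup over such a file. The manifest layout used here (a top-level operations list whose entries carry id, name, type, and body) is an assumption and is not described in these notes; load_manifest and resolve are hypothetical helper names, not router APIs.

```python
import json

# Assumed manifest layout; not taken from the release notes.
SAMPLE_MANIFEST = """
{
  "format": "apollo-persisted-query-manifest",
  "version": 1,
  "operations": [
    {"id": "abc123", "name": "GetUser", "type": "query",
     "body": "query GetUser { user { id } }"}
  ]
}
"""


def load_manifest(text):
    """Build an id -> operation body map from manifest JSON text."""
    manifest = json.loads(text)
    return {op["id"]: op["body"] for op in manifest["operations"]}


def resolve(operations, operation_id):
    """Return the stored body for a known ID, or None for an unknown one."""
    return operations.get(operation_id)


operations = load_manifest(SAMPLE_MANIFEST)
print(resolve(operations, "abc123"))
print(resolve(operations, "missing"))
```

An unknown ID resolves to None in this sketch; how the router itself reacts to unknown operations depends on the log_unknown and safelist settings shown in the configuration.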
Support conditions on standard telemetry events (Issue #5475)
Enables setting conditions on standard events.
For example:
telemetry:
  instrumentation:
    events:
      router:
        request:
          level: info
          condition: # Only log the router request if you sent `x-log-request` with the value `enabled`
            eq:
              - request_header: x-log-request
              - "enabled"
        response: off
        error: error
        # ...
Not supported for batched requests.
By @bnjjj in #5476
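The eq condition in the example compares a request header against a fixed value. The Python sketch below only illustrates that comparison; request_header and should_log_request are hypothetical names, and the case-insensitive header lookup is an assumption about HTTP header handling rather than a statement about the router's implementation.

```python
def request_header(headers, name):
    """Look up a header case-insensitively; returns None when it is absent."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get(name.lower())


def should_log_request(headers):
    """Mirror the eq condition: x-log-request must equal "enabled"."""
    return request_header(headers, "x-log-request") == "enabled"


print(should_log_request({"X-Log-Request": "enabled"}))
print(should_log_request({"x-log-request": "disabled"}))
print(should_log_request({}))
```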
Make status_code available for router_service responses in Rhai scripts (Issue #5357)
Adds response.status_code on Rhai router_service responses. Previously, status_code was only available on subgraph_service responses.
For example:
fn router_service(service) {
    let f = |response| {
        if response.is_primary() {
            print(response.status_code);
        }
    };
    service.map_response(f);
}
By @IvanGoncharov in #5358
Add new values for the supergraph query selector (PR #5433)
Adds support for four new values for the supergraph query selector:
- aliases: the number of aliases in the query
- depth: the depth of the query
- height: the height of the query
- root_fields: the number of root fields in the query
You can use this data to understand how your graph is used and to help determine where to set limits.
For example:
telemetry:
  instrumentation:
    instruments:
      supergraph:
        'query.depth':
          description: 'The depth of the query'
          value:
            query: depth
          unit: unit
          type: histogram
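For intuition about what these selector values measure, the Python sketch below computes three of them over a toy selection tree. It rests on simplifying assumptions: root fields count as depth 1, an alias is any field with an explicit alias, and the tree representation and helper names are hypothetical. Height is left out because these notes do not define it.

```python
def field(name, alias=None, selections=()):
    """Toy representation of one selected field (hypothetical structure)."""
    return {"name": name, "alias": alias, "selections": list(selections)}


def depth(selections):
    """Maximum nesting level, counting root fields as level 1 (assumption)."""
    if not selections:
        return 0
    return 1 + max(depth(f["selections"]) for f in selections)


def aliases(selections):
    """Number of fields at any level that carry an alias."""
    return sum(
        (1 if f["alias"] else 0) + aliases(f["selections"]) for f in selections
    )


def root_fields(selections):
    """Number of fields selected directly on the operation root."""
    return len(selections)


# query { me: user { name friends { nick: name } } products { id } }
query = [
    field("user", alias="me", selections=[
        field("name"),
        field("friends", selections=[field("name", alias="nick")]),
    ]),
    field("products", selections=[field("id")]),
]

print(depth(query), aliases(query), root_fields(query))
```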
Add the ability to drop metrics using otel views (PR #5531)
Using otel views, you can drop specific metrics that you don't want sent to your APM.
telemetry:
  exporters:
    metrics:
      common:
        service_name: apollo-router
        views:
          - name: apollo_router_http_request_duration_seconds # Instrument name you want to edit. You can use wildcard in names. If you want to target all instruments just use '*'
            aggregation: drop
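To preview which instruments a wildcard name would select before dropping them, a rough approximation is possible with Python's fnmatch module. This is only a sketch: fnmatch also gives special meaning to square brackets, which may differ from the matching the router applies, and the second and third instrument names below are placeholders chosen for illustration.

```python
from fnmatch import fnmatchcase


def matching_instruments(instrument_names, pattern):
    """Return the names a wildcard pattern would select (approximation)."""
    return [name for name in instrument_names if fnmatchcase(name, pattern)]


names = [
    "apollo_router_http_request_duration_seconds",
    "apollo_router_example_metric",  # placeholder name
    "example.other.metric",          # placeholder name
]

print(matching_instruments(names, "apollo_router_*"))
print(matching_instruments(names, "*"))
```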
Add operation_name selector for router service in custom telemetry (PR #5392)
Adds an operation_name selector for the router service.
Previously, accessing operation_name was only possible through the response_context router service selector.
For example:
telemetry:
  instrumentation:
    instruments:
      router:
        http.server.request.duration:
          attributes:
            graphql.operation.name:
              operation_name: string
🐛 Fixes
Fix Cache-Control aggregation and age calculation in entity caching (PR #5463)
Enhances the reliability of caching behaviors in the entity cache feature by:
- Ensuring the proper calculation of max-age and s-maxage fields in the Cache-Control header sent to clients.
- Setting appropriate default values if a subgraph does not provide a Cache-Control header.
- Guaranteeing that the Cache-Control header is aggregated consistently, even if the plugin is disabled entirely or on specific subgraphs.
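One way to picture header aggregation is to take the most restrictive max-age across subgraph responses, substituting a default when a subgraph sends no Cache-Control header. The Python sketch below illustrates that idea only. The minimum rule, the default_max_age parameter, and the function names are assumptions made for illustration; the notes do not spell out the router's actual aggregation algorithm or defaults.

```python
import re


def parse_max_age(header):
    """Extract max-age seconds from a Cache-Control value, or None if absent."""
    if header is None:
        return None
    match = re.search(r"(?:^|[,\s])max-age=(\d+)", header)
    return int(match.group(1)) if match else None


def aggregate_max_age(headers, default_max_age):
    """Pick the smallest max-age, using the default for missing values."""
    ages = []
    for header in headers:
        age = parse_max_age(header)
        ages.append(default_max_age if age is None else age)
    return min(ages) if ages else default_max_age


subgraph_headers = ["max-age=60, public", "public, max-age=30", None]
print(aggregate_max_age(subgraph_headers, default_max_age=120))
```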
Fix telemetry events when trace isn't sampled and preserve attribute types (PR #5464)
Improves accuracy and performance of event telemetry by:
- Displaying custom event attributes even if the trace is not sampled
- Preserving original attribute type instead of converting it to string
- Ensuring http.response.body.size and http.request.body.size attributes are treated as numbers, not strings
⚠️ Exercise caution if you have monitoring enabled on your logs, as attribute types may have changed. For example, attributes like http.response.status_code are now numbers (200) instead of strings ("200").
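If a log pipeline has to cope with records produced both before and after this change, it can normalize the attribute on read. The following Python sketch shows one way to do that; status_code_as_int is a hypothetical helper and not part of the router or of any logging library.

```python
def status_code_as_int(value):
    """Accept the old string form ("200") or the new numeric form (200)."""
    if isinstance(value, bool):
        raise TypeError("status code must be an int or a numeric string")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        return int(value)
    raise TypeError("status code must be an int or a numeric string")


old_record = {"http.response.status_code": "200"}
new_record = {"http.response.status_code": 200}

print(status_code_as_int(old_record["http.response.status_code"]))
print(status_code_as_int(new_record["http.response.status_code"]))
```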
Enable coprocessors for subscriptions (PR #5542)
Ensures that coprocessors correctly handle subscriptions by preventing skipped data from being overwritten.
Improve accuracy of query_planning.plan.duration (PR #5)
Previously, the apollo.router.query_planning.plan.duration metric inaccurately included additional processing time beyond query planning. The additional time included pooling time, which is already accounted for in the metric. After this update, apollo.router.query_planning.plan.duration reflects only the query planning duration.
For example, before the change, metrics reported:
2024-06-21T13:37:27.744592Z WARN apollo.router.query_planning.plan.duration 0.002475708
2024-06-21T13:37:27.744651Z WARN apollo.router.query_planning.total.duration 0.002553958
2024-06-21T13:37:27.748831Z WARN apollo.router.query_planning.plan.duration 0.001635833
2024-06-21T13:37:27.748860Z WARN apollo.router.query_planning.total.duration 0.001677167
Post-change metrics now accurately reflect:
2024-06-21T13:37:27.743465Z WARN apollo.router.query_planning.plan.duration 0.00107725
2024-06-21T13:37:27.744651Z WARN apollo.router.query_planning.total.duration 0.002553958
2024-06-21T13:37:27.748299Z WARN apollo.router.query_planning.plan.duration 0.000827
2024-06-21T13:37:27.748860Z WARN apollo.router.query_planning.total.duration 0.001677167
By @xuorig and @lrlna in #5530
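The effect is easiest to see as the gap between the total and plan metrics for the same request. The Python sketch below parses the first pair of sample lines from each set above and subtracts them; parse_metric and gap are hypothetical helpers, and the whitespace-separated layout is assumed from the samples.

```python
def parse_metric(line):
    """Split a sample line into (metric name, value in seconds)."""
    _timestamp, _level, name, value = line.split()
    return name, float(value)


def gap(plan_line, total_line):
    """Time reported by the total metric beyond the plan metric."""
    _, plan = parse_metric(plan_line)
    _, total = parse_metric(total_line)
    return total - plan


total = "2024-06-21T13:37:27.744651Z WARN apollo.router.query_planning.total.duration 0.002553958"
plan_before = "2024-06-21T13:37:27.744592Z WARN apollo.router.query_planning.plan.duration 0.002475708"
plan_after = "2024-06-21T13:37:27.743465Z WARN apollo.router.query_planning.plan.duration 0.00107725"

gap_before = gap(plan_before, total)
gap_after = gap(plan_after, total)
print(f"{gap_before:.9f} {gap_after:.9f}")
```

In these samples the gap grows from roughly 0.08 ms to roughly 1.48 ms, which is consistent with the plan metric no longer absorbing the extra processing time.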
Remove deno_crypto package due to security vulnerability (Issue #5484)
Removes deno_crypto due to the vulnerability reported in curve25519-dalek.
Since the router exclusively used deno_crypto for generating UUIDs using the package's random number generator, this vulnerability had no impact on the router.
Add content-type header to failed auth checks (Issue #5496)
Adds content-type header when returning AUTH_ERROR from authentication service.
By @andrewmcgivery in #5497
Implement manual caching for AWS Security Token Service credentials (PR #5508)
In the AWS Security Token Service (STS), the CredentialsProvider chain includes caching, but this functionality was missing for AssumeRoleProvider.
This change introduces a custom CredentialsProvider that functions as a caching layer with these rules:
- Cache Expiry: Credentials retrieved are stored in the cache based on their credentials.expiry() time if specified, or indefinitely (ever) if not.
- Automatic Refresh: Five minutes before cached credentials expire, an attempt is made to fetch updated credentials.
- Retry Mechanism: If credential retrieval fails, another attempt is scheduled after a one-minute interval.
- (Coming soon, not included in this change) Manual Refresh: The CredentialsProvider will expose a refresh_credentials() function. This can be manually invoked, for instance, upon receiving a 401 error during a subgraph call.
By @o0Ignition0o in #5508
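The expiry, refresh, and retry rules above boil down to deciding when the next fetch should happen. Here is a minimal Python sketch of that decision, not the router's Rust implementation. next_refresh is a hypothetical function, and clamping the refresh time to the current time when credentials expire in under five minutes is an added assumption.

```python
from datetime import datetime, timedelta, timezone

REFRESH_MARGIN = timedelta(minutes=5)  # refresh this long before expiry
RETRY_INTERVAL = timedelta(minutes=1)  # wait this long after a failed fetch


def next_refresh(now, expiry=None, last_fetch_failed=False):
    """Return when to attempt the next credential fetch, or None for never."""
    if last_fetch_failed:
        return now + RETRY_INTERVAL
    if expiry is None:
        return None  # no expiry reported: keep the cached credentials
    # Assumption: never schedule a refresh in the past.
    return max(now, expiry - REFRESH_MARGIN)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(next_refresh(now, expiry=now + timedelta(hours=1)))
print(next_refresh(now, last_fetch_failed=True))
print(next_refresh(now))
```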
📃 Configuration
Align entity caching configuration structure for subgraph overrides (PR #5474)
Aligns the entity cache configuration structure to the same all/subgraphs over...
v1.50.0-rc.0
v1.49.1
Important
If you have enabled Distributed query plan caching, this release changes the hashing algorithm used for the cache keys. On account of this, you should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.
🔒 Security
Replace dependency included in security advisory (Issue #5484)
This removes our use of a dependency that was cited in security advisories RUSTSEC-2024-0344 and GHSA-x4gp-pqpj-f43q.
We have carefully analyzed our usages and determined that Apollo Router is not impacted. We only relied on different functions from the same dependency that were easily replaced. Despite lack of impact, we have opted to remove the dependency entirely out of an abundance of caution. This not only clears the warning on our side immediately, but also provides a clear path forward in the event that this shows up in any of our users' own scans.
Users may upgrade at their own discretion, though as it was determined there is no impact, upgrading is not being explicitly recommended.
See the corresponding GitHub issue.
🐛 Fixes
Update to Federation v2.8.1 (PR #5483)
The above security fix was in router-bridge which had already received a Federation version bump. This bump takes Federation to v2.8.1, which fixes a performance-related matter in composition. However, it does not impact query planning, which means this particular update is a no-op and this is simply a symbolic bump of the number itself, rather than any functional change.
v1.49.1-rc.0
v1.49.0
🚀 Features
Override tracing span names using custom span selectors (Issue #5261)
Adds the ability to override span names by setting the otel.name attribute on any custom telemetry selectors.
This example changes the span name to router:
telemetry:
  instrumentation:
    spans:
      router:
        otel.name:
          static: router # Override the span name to router
Add description and units to standard instruments (PR #5407)
This PR adds descriptions and units to standard instruments available in the router. These descriptions and units have been copied directly from the OpenTelemetry semantic conventions and are needed for better integrations with APMs.
Add with_lock() method to Extensions to help avoid timing issues (PR #5360)
For those writing custom Rust plugins, we've introduced with_lock(), which explicitly restricts the lifetime of the Extensions lock.
Without this method, it was too easy to run into issues interacting with the Extensions since we would inadvertently hold locks for too long. This was a source of bugs in the router and caused a lot of tests to be flaky.
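The pattern behind with_lock() is to hand a closure the locked data so the lock cannot outlive the call. The following is a Python analogy of that pattern, not the Rust signature from the router. The Extensions class here is a hypothetical stand-in written only to show how the lock's lifetime is bounded by the closure.

```python
import threading


class Extensions:
    """Hypothetical stand-in: a map guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def with_lock(self, func):
        """Run func on the guarded map; the lock is released when func returns."""
        with self._lock:
            return func(self._values)


extensions = Extensions()
extensions.with_lock(lambda values: values.update(request_id="abc"))
request_id = extensions.with_lock(lambda values: values.get("request_id"))
print(request_id)
```

Because callers never receive the lock guard itself, they cannot accidentally keep it alive across later work.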
Add support for unix_ms_now in Rhai customizations (Issue #5182)
Rhai customizations can now use the unix_ms_now() function to obtain the current Unix timestamp in milliseconds since the Unix epoch.
For example:
fn supergraph_service(service) {
    let now = unix_ms_now();
}
By @shaikatzz in #5181
🐛 Fixes
Improve error message produced when subgraph responses don't include an expected content-type header value (Issue #5359)
To enhance debuggability when a subgraph response lacks an expected content-type header value, the error message now includes additional details.
Examples:
HTTP fetch failed from 'test': subgraph response contains invalid 'content-type' header value \"application/json,application/json\"; expected content-type: application/json or content-type: application/graphql-response+json
HTTP fetch failed from 'test': subgraph response does not contain 'content-type' header; expected content-type: application/json or content-type: application/graphql-response+json
By @IvanGoncharov in #5223
Performance improvements for demand control (PR #5405)
Removes unneeded logic in the hot path of the recently released public preview of the demand control feature to improve performance.
By @BrynCooke in #5405
Skip hashing the entire schema on every query plan cache lookup (PR #5374)
This fixes performance issues when looking up query plans for large schemas.
Important
If you have enabled Distributed query plan caching, this release changes the hashing algorithm used for the cache keys. On account of this, you should anticipate additional cache regeneration cost when updating between these versions while the new hashing algorithm comes into service.
Optimize GraphQL instruments (PR #5375)
When processing selectors for GraphQL instruments, heap allocations should be avoided for optimal performance. This change removes Vec allocations that were previously performed per field, yielding significant performance improvements.
By @BrynCooke in #5375
Log metrics overflow as a warning rather than an error (Issue #5173)
If a metric has too high a cardinality, the following is displayed as a warning instead of an error:
OpenTelemetry metric error occurred: Metrics error: Warning: Maximum data points for metric stream exceeded/ Entry added to overflow
Add support of response_context selectors for error conditions (PR #5288)
Provides the ability to configure custom instruments. For example:
http.server.request.timeout:
  type: counter
  value: unit
  description: "request in timeout"
  unit: request
  attributes:
    graphql.operation.name:
      response_context: operation_name
  condition:
    eq:
      - "request timed out"
      - error: reason
Inaccurate apollo_router_opened_subscriptions counter (PR #5363)
Fixes the apollo_router_opened_subscriptions counter which previously only incremented. The counter now also decrements.
📃 Configuration
🛠 Maintenance
Skip GraphOS tests when Apollo key not present (PR #5362)
Some tests require APOLLO_KEY and APOLLO_GRAPH_REF to execute successfully.
These are now skipped if these environment variables are not present, allowing external contributors to the router to successfully run the entire test suite.
By @BrynCooke in #5362
📚 Documentation
Standard instrument configuration documentation for subgraphs (PR #5422)
Added documentation about standard instruments available at the subgraph service level:
- http.client.request.body.size - A histogram of request body sizes for requests handled by subgraphs.
- http.client.request.duration - A histogram of request durations for requests handled by subgraphs.
- http.client.response.body.size - A histogram of response body sizes for requests handled by subgraphs.
These instruments are configurable in router.yaml:
telemetry:
  instrumentation:
    instruments:
      subgraph:
        http.client.request.body.size: true # (default false)
        http.client.request.duration: true # (default false)
        http.client.response.body.size: true # (default false)
Update docs frontmatter for consistency and discoverability (PR #5164)
Makes title case consistent for page titles and adds subtitles and meta-descriptions for better discoverability.
By @Meschreiber in #5164
🧪 Experimental
Warm query plan cache using persisted queries on startup (Issue #5334)
Adds support for the router to use persisted queries to warm the query plan cache upon startup using a new experimental_prewarm_query_plan_cache configuration option under persisted_queries.
To enable:
persisted_queries:
  enabled: true
  experimental_prewarm_query_plan_cache: true
Apollo reporting signature enhancements (PR #5062)
Adds a new experimental configuration option to turn on some enhancements for the Apollo reporting stats report key:
- Signatures will include the full normalized form of input objects
- Signatures will include aliases
- Some small normalization improvements
This new configuration (telemetry.apollo.experimental_apollo_signature_normalization_algorithm) only works in experimental_apollo_metrics_generation_mode: new mode, and we don't yet recommend enabling it while we continue to verify that the new functionality works as expected.
Add experimental support for sending traces to Studio via OTLP (PR #4982)
As the ecosystem around OpenTelemetry (OTel) has been expanding rapidly, we are evaluating a migration of Apollo's internal tracing system to use an OTel-based protocol.
In the short-term, benefits include:
- A comprehensive way to visualize the router execution path in GraphOS Studio.
- Additional spans that were previously not included in Studio traces, such as query parsing, planning, execution, and more.
- Additional metadata such as subgraph fetch details, router idle / busy timing, and more.
Long-term, we see this as a strategic enhancement to consolidate these two disparate tracing systems.
This will pave the way for future enhancements to more easily plug into the Studio trace visualizer.
Configuration
This change adds a new configuration option experimental_otlp_tracing_sampler. This can be used to send a percentage of traces via OTLP instead...
v1.49.0-rc.1
v1.49.0-rc.0
v1.49.0-alpha.0