Doc: Add tutorial for filter-elastic_integration #15932

Open
wants to merge 10 commits into
base: main
from
186 changes: 186 additions & 0 deletions docs/static/ea-integration-tutorial.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,186 @@
[[ea-integrations-tutorial]]
=== Tutorial: Using the {ls} `elastic_integration filter` to extend Elastic {integrations}
++++
<titleabbrev>Tutorial: {ls} `elastic_integration filter`</titleabbrev>
Contributor

Suggested change
<titleabbrev>Tutorial: {ls} `elastic_integration filter`</titleabbrev>
<titleabbrev>Tutorial: {ls} `elastic_integration` filter</titleabbrev>

++++

You can use {ls} to transform events collected by {agent} and paired with an {integrations-docs}[Elastic integration].
You get the benefits of Elastic integrations--such as the simplicity of ingesting data from a wide variety of data
sources and ensuring compliance with the {ecs-ref}/index.html[Elastic Common Schema (ECS)]--combined with the extra
Contributor

-- isn't rendered properly?

processing power of {ls}.

This new functionality is made possible by the <<plugins-filters-elastic_integration,elastic_integration filter>> plugin.
When you include the `elastic_integration` filter in your configuration, {ls} reads certain field values generated by the {agent},
and uses them to apply the transformations from Elastic integrations.
This allows you to further process events in the {ls} pipeline before sending them to their
configured destinations.

This tutorial walks you through adding the {integrations-docs}/crowdstrike-intro[Crowdstrike integration], using {ls} to
Contributor Author

@flexitrev, would you like to highlight a different integration for this tutorial and use case?

remove the `_version` field, and then sending the data to {ess} or self-managed {es}.
Comment on lines +18 to +19
Member

I believe that removing the _version field was a workaround for the specific ingest pipeline setting the _version field on ingest document instead of on the ingest document's metadata. When it gets put on the metadata, our filter correctly propagates it to the right places on the resulting event so that downstream ES output can choose to use it or not depending on its configuration.

My worry here is that people will copy/paste this config and assume that it is necessary.

Contributor Author

ToDo: Come up with a better example that won't mislead users



[[ea-integrations-prereqs]]
==== Prerequisites

You need:

* A working {es} cluster
* A {ls} instance
* {fleet-server}
* An {fleet-guide}/elastic-agent-installation.html[{agent} installed] on the hosts you want to collect data from, and configured to {fleet-guide}/logstash-output.html[send output to {ls}]
* An active Elastic Enterprise https://www.elastic.co/subscriptions[subscription]
* A user configured with the <<plugins-filters-elastic_integration-minimum_required_privileges,minimum required privileges>>

NOTE: Even though the focus of this tutorial is {Fleet}-managed agents, you can use the `elastic_integration` filter and this
general approach with {fleet-guide}/elastic-agent-configuration.html[self-managed agents].


[[ea-integrations-process-overview]]
==== Process overview

* <<ea-integrations-fleet>>
* <<ea-integrations-create-policy>>
* <<ea-integrations-pipeline>>
Contributor

Don't we need to set up the LS instance before the LS policy? I haven't fully followed the steps, so I'm not sure what experience you get when setting the policy first.


[discrete]
[[ea-integrations-fleet]]
=== Configure {fleet} to send data from {agent} to {ls}

. For {fleet}-managed agents, go to {kib} and navigate to *Fleet > Settings*.

. Create a new output and specify {ls} as the output type.

. Add the {ls} hosts (domain names or IP addresses) that the {agent} should send data to.

. Add the client SSL certificate and the client SSL certificate key to the configuration.

. Click *Save and apply settings* in the bottom right-hand corner of the page.

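On the {ls} side, the `elastic_agent` input needs matching TLS settings so that it accepts connections from the agents.
The following is a minimal sketch, assuming a recent version of the input plugin that uses the `ssl_*` option names shown.
The port matches the samples later in this tutorial, and all file paths are placeholders.

[source,txt]
-----
input {
  elastic_agent {
    port => 5055                                        # must match the port in the Fleet output hosts
    ssl_enabled => true
    ssl_certificate_authorities => ["/path/to/ca.crt"]  # placeholder: CA that signed the agents' client certificate
    ssl_certificate => "/path/to/logstash.crt"          # placeholder: server certificate presented to the agents
    ssl_key => "/path/to/logstash.pkcs8.key"            # placeholder: server key in PKCS#8 format
    ssl_client_authentication => "required"             # require the client certificate configured in Fleet
  }
}
-----
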
[discrete]
[[ea-integrations-create-policy]]
=== Create an {agent} policy with the necessary integrations

. In {kib}, navigate to *Fleet > Agent policies*, and select *Create agent policy*.

. Give this policy a name, and then select *Advanced options*.

. Change the *Output for integrations* setting to the {ls} output you created.

. Click *Create agent policy*.

. Select the policy name, and click *Add integration*.
+
This step takes you to the integrations browser, where you can select an integration that has everything
necessary to _integrate_ the data source with your other data in the {stack}.
We'll use Crowdstrike as our example in this tutorial.

. On the *Crowdstrike* integration overview page, click *Add Crowdstrike* to configure the integration.

. Configure the integration to collect the data you need.
In step 2 at the bottom of the page (*Where to add this integration?*), make sure that the *Existing hosts* option
is selected and that the selected agent policy is the {ls} policy you created for your {ls} output.
This policy should be selected by default.

. Click *Save and continue*.
+
You have the option to add the {agent} to your hosts.
If you haven't already, {fleet-guide}/elastic-agent-installation.html[install the {agent}] on the host where you want to collect data.


[discrete]
[[ea-integrations-pipeline]]
=== Configure {ls} to use the `elastic_integration` filter plugin

. Create a new {logstash-ref}/configuration.html[{ls} pipeline].
. Be sure to include these plugins:

* <<plugins-inputs-elastic_agent,`elastic_agent` input>>
* <<plugins-filters-elastic_integration,`elastic_integration` filter>>
* <<plugins-outputs-elasticsearch,`elasticsearch` output>>

Note that every event sent from the {agent} to {ls} contains specific meta-fields.
{ls} expects events to contain a top-level `data_stream` field with `type`, `dataset`, and `namespace` sub-fields.

{ls} uses this information and its connection to {es} to determine which integrations to apply to the event before sending the event to its destination output.
{ls} frequently synchronizes with {es} to ensure that it has the most recent versions of the enabled integrations.

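Because the `data_stream` sub-fields are regular event fields, filters that follow `elastic_integration` can branch on them.
The following is a minimal sketch, not a required configuration: the dataset name `crowdstrike.falcon` and the tag name are assumptions used for illustration, so check the values that your integration actually emits.

[source,txt]
-----
filter {
  elastic_integration {
    cloud_id => "your-cloud:id"
    api_key  => "api-key"
  }

  # Runs after the integration's transformations have been applied.
  # The dataset value is an assumed example.
  if [data_stream][dataset] == "crowdstrike.falcon" {
    mutate { add_tag => ["crowdstrike_falcon"] }
  }
}
-----
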

[discrete]
[[ea-integrations-ess-sample]]
==== Sample configuration: output to {ess}

This sample illustrates using the `elastic_agent` input and the `elastic_integration` filter for processing in {ls}, and then sending the output to {ess}.

Check out the <<plugins-filters-elastic_integration,elastic_integration filter plugin docs>> for the full list of configuration options.

[source,txt]
-----
input {
  elastic_agent { port => 5055 }
}

filter {
  elastic_integration {
    cloud_id => "your-cloud:id"
    api_key => "api-key"
    remove_field => ["_version"]
  }
}

output {
  stdout {}
  elasticsearch {
    cloud_auth => "elastic:<pwd>"
    cloud_id => "your-cloud:id"
  }
}
-----

All processing occurs in {ls} before events are forwarded to {ess}.

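As an alternative to `cloud_auth`, the `elasticsearch` output also accepts an `api_key` setting in `id:api_key` format.
The following is a minimal sketch, assuming you have created an API key that is allowed to write to the target data streams; both values are placeholders.

[source,txt]
-----
output {
  elasticsearch {
    cloud_id => "your-cloud:id"
    api_key  => "id:api-key"   # placeholder, in <id>:<api_key> format
  }
}
-----
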
[discrete]
[[ea-integrations-es-sample]]
==== Sample configuration: output to self-managed {es}

This sample illustrates using the `elastic_agent` input and the `elastic_integration` filter for processing in {ls}, and then sending the output to {es}.

Check out the <<plugins-filters-elastic_integration,elastic_integration filter plugin docs>> for the full list of configuration options.

For the privileges that the user configured in the filter needs, see <<plugins-filters-elastic_integration-minimum_required_privileges>>.

[source,txt]
-----
input {
  elastic_agent { port => 5055 }
}

filter {
  elastic_integration {
    hosts => "{es-host}:9200"
    ssl_enabled => true
    ssl_certificate_authorities => "/usr/share/logstash/config/certs/ca-cert.pem"
    username => "elastic"
    password => "changeme"
    remove_field => ["_version"]
  }
}

output {
  # Print events to the console to debug data stream inputs
  stdout {
    codec => rubydebug
  }
  # Send events to Elasticsearch
  elasticsearch {
    hosts => "{es-host}:9200"
    user => "elastic"
    password => "changeme"
    ssl_certificate_authorities => "/usr/share/logstash/config/certs/ca-cert.pem"
  }
}
-----

Note that the user credentials that you specify in the `elastic_integration` filter must have sufficient privileges to get information about {es} and the integrations that you are using.

If your {agent} and {ls} pipeline are configured correctly, then events go to {ls} for processing before {ls} forwards them on to {es}.


11 changes: 8 additions & 3 deletions docs/static/ea-integrations.asciidoc
Original file line number Diff line number Diff line change
@@ -8,14 +8,14 @@ You can take advantage of the extensive, built-in capabilities of Elastic {integ
[[integrations-value]]
=== Elastic {integrations}: ingesting to visualizing

https://docs.elastic.co/integrations[Elastic {integrations}] provide quick, end-to-end solutions for:
{integrations-docs}[Elastic {integrations}] provide quick, end-to-end solutions for:

* ingesting data from a variety of data sources,
* ensuring compliance with the {ecs-ref}/index.html[Elastic Common Schema (ECS)],
* getting the data into the {stack}, and
* visualizing it with purpose-built dashboards.

{integrations} are available for https://docs.elastic.co/integrations/all_integrations[popular services and platforms], such as Nginx, AWS, and MongoDB, as well as many generic input types like log files.
{integrations} are available for {integrations-docs}/all_integrations[popular services and platforms], such as Nginx, AWS, and MongoDB, as well as many generic input types like log files.
Each integration includes pre-packaged assets to help reduce the time between ingest and insights.

To see available integrations, go to the {kib} home page, and click **Add {integrations}**.
@@ -78,9 +78,11 @@ output { <3>
-----

<1> Use `filter-elastic_integration` as the first filter in your pipeline
<2> You can use additional filters as long as they follow `filter-elastic_integration`
<2> You can use additional filters as long as they follow `filter-elastic_integration`.
They will have access to the event as transformed by your enabled integrations.
<3> Sample config to output data to multiple destinations


[discrete]
[[es-tips]]
==== Using `filter-elastic_integration` with `output-elasticsearch`
@@ -93,3 +95,6 @@ Be sure that these features are enabled in the {logstash-ref}/plugins-outputs-el
* Set {logstash-ref}/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-ecs_compatibility[`ecs_compatibility`] to `v1` or `v8`.

Check out the {logstash-ref}/plugins-outputs-elasticsearch.html[`output-elasticsearch` plugin] docs for additional settings.


include::ea-integration-tutorial.asciidoc[]