docs: eval concepts (#663)
madams0013 authored Feb 5, 2025
1 parent 317481b commit 0ca085b
Showing 4 changed files with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions docs/evaluation/concepts/index.mdx
@@ -139,11 +139,15 @@ Learn [how run pairwise evaluations](/evaluation/how_to_guides/evaluate_pairwise

Each time we evaluate an application on a dataset, we are conducting an experiment.
An experiment contains the results of running a specific version of your application on the dataset.
To understand how to use the LangSmith experiment view, see [how to analyze experiment results](/evaluation/how_to_guides/analyze_single_experiment).

![Experiment view](./static/experiment_view.png)

Typically, we will run multiple experiments on a given dataset, testing different configurations of our application (e.g., different prompts or LLMs).
In LangSmith, you can easily view all the experiments associated with your dataset.
Additionally, you can [compare multiple experiments in a comparison view](/evaluation/how_to_guides/compare_experiment_results).

![Example](./static/comparing_multiple_experiments.png)
![Comparison view](./static/comparison_view.png)

## Annotation queues

@@ -191,7 +195,7 @@ Often these are triggered when you are making app updates (e.g. updating models
LangSmith's comparison view has native support for regression testing, allowing you to quickly see examples that have changed relative to the baseline.
Regressions are highlighted red, improvements green.

![Regression](./static/regression.png)
![Comparison view](./static/comparison_view.png)
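
As a rough illustration of the regression-testing idea described above, the sketch below labels each example as a regression, an improvement, or unchanged relative to a baseline experiment. It is plain Python and does not use the LangSmith SDK; the `Change` class, the `compare_to_baseline` function, the `tolerance` parameter, and the example IDs are all hypothetical names chosen for this sketch, and it assumes that a higher score is better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Per-example comparison between a baseline and a candidate experiment."""
    example_id: str
    baseline: float
    candidate: float
    status: str  # "regression", "improvement", or "unchanged"


def compare_to_baseline(baseline_scores, candidate_scores, tolerance=0.0):
    """Label each example present in both experiments.

    Assumption for this sketch: scores are numeric and higher is better.
    """
    changes = []
    shared_ids = sorted(baseline_scores.keys() & candidate_scores.keys())
    for example_id in shared_ids:
        base = baseline_scores[example_id]
        cand = candidate_scores[example_id]
        delta = cand - base
        if delta < -tolerance:
            status = "regression"
        elif delta > tolerance:
            status = "improvement"
        else:
            status = "unchanged"
        changes.append(Change(example_id, base, cand, status))
    return changes


# Hypothetical per-example scores from two experiments on the same dataset.
baseline = {"ex-1": 1.0, "ex-2": 0.0, "ex-3": 1.0}
candidate = {"ex-1": 0.0, "ex-2": 1.0, "ex-3": 1.0}

for change in compare_to_baseline(baseline, candidate):
    print(change.example_id, change.status)
```

Only examples that appear in both experiments are compared, which mirrors the requirement that both experiments run on the same dataset.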

### Backtesting

Binary file removed docs/evaluation/concepts/static/regression.png