Merge branch 'main' into harrison/tracing-tutorial
Showing 100 changed files with 3,148 additions and 160 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"editor.trimAutoWhitespace": false, | ||
"files.trimTrailingWhitespaceInRegexAndStrings": false | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
--- | ||
sidebar_label: Manage Datasets | ||
-sidebar_position: 4 | ||
+sidebar_position: 5 | ||
--- | ||
|
||
import { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
--- | ||
sidebar_label: Regression Testing | ||
sidebar_position: 3 | ||
--- | ||
|
||
# Regression Testing | ||
|
||
When evaluating LLM applications, it is important to track how your system performs over time. In this guide, we will show you how to use LangSmith's comparison view | ||
to track regressions in your application and drill down to inspect the specific runs that improved or regressed over time. | ||
|
||
## Overview | ||
|
||
In the LangSmith comparison view, runs that _regressed_ on your specified feedback key against your baseline experiment will be highlighted in red, while runs that _improved_ | ||
will be highlighted in green. At the top of each column, you can see how many runs in that experiment did better and how many did worse than your baseline experiment. | ||
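
To make the red/green highlighting concrete, here is a minimal sketch of how per-run comparisons against a baseline could be counted. It is not LangSmith's implementation and does not call the LangSmith SDK. The function name `compare_to_baseline`, the example IDs, and the scores are all hypothetical, and the sketch assumes that a higher score on the feedback key is better.

```python
from collections import Counter


def compare_to_baseline(baseline, candidate):
    """Label each example as improved, regressed, or unchanged.

    Both arguments map an example ID to a numeric score for one feedback key.
    Assumption: a higher score is better.
    """
    labels = {}
    for example_id, base_score in baseline.items():
        if example_id not in candidate:
            continue  # no run in the candidate experiment to compare
        new_score = candidate[example_id]
        if new_score > base_score:
            labels[example_id] = "improved"
        elif new_score < base_score:
            labels[example_id] = "regressed"
        else:
            labels[example_id] = "unchanged"
    return labels


# Hypothetical scores for a single feedback key.
baseline_scores = {"ex-1": 1.0, "ex-2": 0.0, "ex-3": 1.0}
candidate_scores = {"ex-1": 1.0, "ex-2": 1.0, "ex-3": 0.0}

labels = compare_to_baseline(baseline_scores, candidate_scores)
summary = Counter(labels.values())  # per-column totals
print(summary["improved"], summary["regressed"])
```

The `summary` counter plays the role of the totals shown at the top of each column.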
|
||
![Regressions](../static/regression_view.png) | ||
|
||
## Baseline Experiment | ||
|
||
To track regressions, you need a baseline experiment to compare against. By default, the first experiment in your comparison is used as the baseline, but you can | ||
change it from the dropdown at the top of the page. | ||
|
||
![Baseline](../static/select_baseline.png) | ||
|
||
## Select Feedback Key | ||
|
||
You will also want to select the feedback key on which you would like to focus. You can select it from another dropdown at the top of the page. Again, one will be assigned by | ||
default, but you can adjust it as needed. | ||
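
The choice of feedback key matters because the same pair of experiments can look better on one key and worse on another. The sketch below illustrates this with plain Python dictionaries. The helper `score_deltas`, the key names `correctness` and `conciseness`, and all scores are hypothetical and are not taken from LangSmith.

```python
def score_deltas(baseline_feedback, candidate_feedback, key):
    """Return candidate-minus-baseline score per example for one feedback key.

    Each argument maps an example ID to a dict of {feedback key: score}.
    Examples missing the key in either experiment are skipped.
    """
    deltas = {}
    for example_id, base in baseline_feedback.items():
        other = candidate_feedback.get(example_id, {})
        if key in base and key in other:
            deltas[example_id] = other[key] - base[key]
    return deltas


# Hypothetical feedback for two experiments over the same two examples.
baseline_feedback = {
    "ex-1": {"correctness": 1.0, "conciseness": 0.5},
    "ex-2": {"correctness": 0.0, "conciseness": 1.0},
}
candidate_feedback = {
    "ex-1": {"correctness": 1.0, "conciseness": 1.0},
    "ex-2": {"correctness": 1.0, "conciseness": 0.5},
}

by_correctness = score_deltas(baseline_feedback, candidate_feedback, "correctness")
by_conciseness = score_deltas(baseline_feedback, candidate_feedback, "conciseness")
print(by_correctness)
print(by_conciseness)
```

In this made-up data, `ex-2` improves on `correctness` but has a negative delta on `conciseness`, so which runs count as regressions depends on the selected key.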
|
||
![Feedback](../static/select_feedback.png) | ||
|
||
## Filter to Regressions or Improvements | ||
|
||
Click the regressions or improvements button at the top of each column to filter to the runs that regressed or improved in that specific experiment. | ||
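
Conceptually, the filter keeps only the rows where one experiment's score moved in the chosen direction relative to the baseline. The following sketch shows that idea on an in-memory table. The function `filter_rows`, the row layout, and the experiment names `exp-a`, `exp-b`, and `exp-c` are hypothetical, and the sketch again assumes a higher score is better.

```python
def filter_rows(rows, baseline, experiment, mode):
    """Return example IDs that regressed or improved in one experiment.

    rows: list of {"example_id": str, "scores": {experiment name: score}}
    mode: "regressions" or "improvements"
    """
    if mode not in ("regressions", "improvements"):
        raise ValueError(f"unknown mode: {mode!r}")
    kept = []
    for row in rows:
        scores = row["scores"]
        if baseline not in scores or experiment not in scores:
            continue  # nothing to compare for this example
        delta = scores[experiment] - scores[baseline]
        if mode == "regressions" and delta < 0:
            kept.append(row["example_id"])
        elif mode == "improvements" and delta > 0:
            kept.append(row["example_id"])
    return kept


# Hypothetical comparison table: one row per example, one score per experiment.
rows = [
    {"example_id": "ex-1", "scores": {"exp-a": 1.0, "exp-b": 0.0, "exp-c": 1.0}},
    {"example_id": "ex-2", "scores": {"exp-a": 0.0, "exp-b": 1.0, "exp-c": 0.0}},
    {"example_id": "ex-3", "scores": {"exp-a": 1.0, "exp-b": 1.0, "exp-c": 0.0}},
]

regressed_in_b = filter_rows(rows, baseline="exp-a", experiment="exp-b", mode="regressions")
improved_in_b = filter_rows(rows, baseline="exp-a", experiment="exp-b", mode="improvements")
regressed_in_c = filter_rows(rows, baseline="exp-a", experiment="exp-c", mode="regressions")
print(regressed_in_b, improved_in_b, regressed_in_c)
```

Because the filter is scoped to a single experiment column, `exp-b` and `exp-c` yield different sets of runs against the same baseline.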
|
||
![Regressions Filter](../static/filter_to_regressions.png) | ||
|
||
## Try it out | ||
|
||
To get started with regression testing, try [running a no-code experiment in our prompt playground](experiments-app) or check out the [Evaluation Quick Start Guide](/evaluation/quickstart) to use the SDK. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
--- | ||
sidebar_label: Unit Test | ||
-sidebar_position: 3 | ||
+sidebar_position: 4 | ||
--- | ||
|
||
# Unit Tests | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
--- | ||
sidebar_label: Version Datasets | ||
-sidebar_position: 5 | ||
+sidebar_position: 6 | ||
--- | ||
|
||
# How to version datasets | ||
|