Vespa Product Ranking

This sample application demonstrates how to improve Product Search with Learning to Rank (LTR).

Blog post series:

Improving Product Search with Learning to Rank - part one introduces the dataset used in this sample application and several baseline ranking models.
Improving Product Search with Learning to Rank - part two demonstrates how to train neural methods for search ranking. The neural training routine is found in Learning to rank with Transformer models .
Improving Product Search with Learning to Rank - part three shows how to train GBDT methods for search ranking. The model also uses neural signals as features. See notebooks:
- XGBoost
- LightGBM

This work uses the largest product relevance dataset released by Amazon:

We introduce the “Shopping Queries Data Set”, a large dataset of difficult search queries, released with the aim of fostering research in the area of semantic matching of queries and products. For each query, the dataset provides a list of up to 40 potentially relevant results, together with ESCI relevance judgements (Exact, Substitute, Complement, Irrelevant) indicating the relevance of the product to the query. Each query-product pair is accompanied by additional information. The dataset is multilingual, as it contains queries in English, Japanese, and Spanish.

The dataset is found at amazon-science/esci-data. The dataset is released under the Apache 2.0 license.

Quick start

The following is a quick start recipe for this application. Requirements:

- Docker Desktop installed and running. 6 GB of available memory for Docker is recommended. Refer to Docker memory for details and troubleshooting.
- Alternatively, deploy using Vespa Cloud.
- Operating system: Linux, macOS, or Windows 10 Pro (Docker requirement).
- Architecture: x86_64 or arm64.
- Homebrew to install the Vespa CLI, or download a Vespa CLI release from GitHub releases.
- zstd: brew install zstd
- Either python3 with pyvespa, pyarrow, and pandas installed, or uv.

Validate Docker resource settings, should be minimum 6 GB:

$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
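
The memory check above can also be scripted. The following is a minimal sketch, assuming the `docker info` output contains a line of the form `Total Memory: 7.667GiB`; the function names and the interpretation of "6 GB" as 6 GiB are assumptions made for illustration, and the podman `memTotal` format is not handled.

```python
import re

# Assumption: the recommended "6 GB" is treated as 6 GiB here.
MIN_BYTES = 6 * 1024**3

# Binary unit suffixes as printed by `docker info`.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def docker_total_memory_bytes(info_text: str) -> float:
    """Extract the 'Total Memory' value from `docker info` output, in bytes."""
    match = re.search(r"Total Memory:\s*([\d.]+)\s*(B|KiB|MiB|GiB|TiB)", info_text)
    if match is None:
        raise ValueError("no 'Total Memory' line found in the given text")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def has_enough_memory(info_text: str) -> bool:
    """Return True if the reported memory meets the recommended minimum."""
    return docker_total_memory_bytes(info_text) >= MIN_BYTES
```

To use it, capture the output of `docker info` (for example with `subprocess.run(["docker", "info"], capture_output=True, text=True).stdout`) and pass it to `has_enough_memory`.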

Install Vespa CLI:

$ brew install vespa-cli

For local deployment using docker image:

$ vespa config set target local

Pull and start the vespa docker container image:

$ docker pull vespaengine/vespa
$ docker run --detach --name vespa --hostname vespa-container \
  --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 \
  vespaengine/vespa

Verify that configuration service (deploy api) is ready:

$ vespa status deploy --wait 300

Download this sample application:

$ vespa clone commerce-product-ranking my-app && cd my-app

Download cross-encoder model:

$ curl -L -o application/models/title_ranker.onnx \
    https://data.vespa-cloud.com/sample-apps-data/title_ranker.onnx

See scripts/export-bi-encoder.py and scripts/export-cross-encoder.py for how to export models from PyTorch to ONNX format.

Deploy the application:

$ vespa deploy --wait 600 application

If the above fails, check the logs:

$ docker logs vespa

Deployment note

It is possible to deploy this app to Vespa Cloud.

Run basic system test

This optional step indexes two documents and runs a query test:

$ (cd application; vespa test tests/system-test/feed-and-search-test.json)

Indexing sample product data

Feed the pre-processed sample product data for 16 products:

$ zstdcat sample-data/sample-products.jsonl.zstd | vespa feed -

Evaluation

Evaluate the semantic-title rank profile using the evaluation script (scripts/evaluate.py).

Install requirements

$ pip3 install pandas pyarrow "pyvespa>=0.53.0"

With the dependencies installed, we can evaluate the ranking model using the evaluation script:

$ python3 scripts/evaluate.py --endpoint http://localhost \
  --example_file sample-data/test-sample.parquet \
  --ranking semantic-title \
  --qrel_file https://data.vespa-cloud.com/sample-apps-data/test.qrels > results.txt

evaluate.py runs all the queries in the test split using the --ranking <rank-profile> and prints the NDCG score (and search time statistics).
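
To make the role of the rank profile concrete, here is a rough sketch of how a single query request that selects a rank profile could be built. It is not taken from scripts/evaluate.py: the function name `build_query_url`, the YQL string, and the use of port 8080 (the port published by the container above) are assumptions, and the real script may pass additional parameters, such as query embeddings, that the rank profiles require.

```python
import urllib.parse


def build_query_url(endpoint: str, query: str, rank_profile: str, hits: int = 10) -> str:
    """Build a Vespa query API URL that selects the given rank profile.

    The YQL below is an illustrative assumption, not the query used by
    scripts/evaluate.py.
    """
    params = {
        "yql": "select * from sources * where userQuery()",
        "query": query,
        "ranking.profile": rank_profile,  # selects the rank profile
        "hits": hits,
    }
    return f"{endpoint.rstrip('/')}/search/?" + urllib.parse.urlencode(params)


# Hypothetical usage against the local container started earlier.
url = build_query_url("http://localhost:8080", "running shoes", "semantic-title")
```

The resulting URL can be fetched with any HTTP client; the response is a JSON document listing the hits ordered by the selected rank profile.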

Note that the evaluation script uses custom NDCG label gains:

Label 1 is Irrelevant with 0 gain
Label 2 is Supplement with 0.01 gain
Label 3 is Complement with 0.1 gain
Label 4 is Exact with 1 gain
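
As an illustration of how these gains enter the metric, the sketch below computes NDCG for one ranked list of labels. It is a minimal sketch and not the code in scripts/evaluate.py: the logarithmic discount `log2(rank + 1)`, the absence of a rank cutoff, and computing the ideal ordering from the retrieved labels only are assumptions.

```python
import math

# Gain per numeric label, as listed above.
LABEL_GAINS = {1: 0.0, 2: 0.01, 3: 0.1, 4: 1.0}


def dcg(labels):
    """Discounted cumulative gain of labels given in ranked order (rank 1 first)."""
    return sum(
        LABEL_GAINS[label] / math.log2(rank + 1)
        for rank, label in enumerate(labels, start=1)
    )


def ndcg(ranked_labels):
    """NDCG of a ranking; the ideal ordering sorts the same labels by gain.

    Assumption: every judged product for the query appears in ranked_labels.
    """
    ideal = dcg(sorted(ranked_labels, key=LABEL_GAINS.get, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0
```

With these gains, a label-4 product contributes ten times as much as a label-3 product at the same rank, so placing the label-4 products first dominates the score.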

$ cat results.txt

Example ranking produced by Vespa using the semantic-title rank-profile for query 535:

B08PB9TTKT 1 0.4638
B00B4PJC9K 2 0.4314
B0051GN8JI 3 0.4199
B084TV3C1B 4 0.4177
B08NVQ8MZX 5 0.4175
B00DHUA9VA 6 0.4155
B08SHMLP5S 7 0.4151
B08VSJGP1N 8 0.4147
B08QGZMCYQ 9 0.4110
B0007KPRIS 10 0.4073
B08VJ66CNL 11 0.4040
B000J1HDWI 12 0.4035
B0007KPS3C 13 0.3977
B0072LFB68 14 0.3933
B01M0SFMIH 15 0.3920
B0742BZXC2 16 0.3778
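
The listing above can be read as whitespace-separated triples of product id, rank, and score. A small parsing sketch under that assumption follows; the `RankedHit` type and `parse_ranking` function are hypothetical names, and the actual results.txt may contain additional columns that this sketch does not handle.

```python
from typing import List, NamedTuple


class RankedHit(NamedTuple):
    product_id: str
    rank: int
    score: float


def parse_ranking(text: str) -> List[RankedHit]:
    """Parse whitespace-separated (product id, rank, score) triples."""
    tokens = text.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected a multiple of three tokens")
    return [
        RankedHit(tokens[i], int(tokens[i + 1]), float(tokens[i + 2]))
        for i in range(0, len(tokens), 3)
    ]


# The first two entries of the ranking shown above.
hits = parse_ranking("B08PB9TTKT 1 0.4638 B00B4PJC9K 2 0.4314")
```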

This particular product ranking for the query produces an NDCG score of 0.7046. Note that the sample-data/test-sample.parquet file only contains one query. To get the overall score, one must compute the NDCG scores of all queries in the test split and report the average NDCG score.
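
The averaging step can be sketched as follows. This is only an illustration: the function name `mean_ndcg` is hypothetical, and the per-query values in the usage lines are made up for the example rather than taken from the dataset.

```python
from statistics import mean
from typing import Dict


def mean_ndcg(per_query_ndcg: Dict[str, float]) -> float:
    """Average the per-query NDCG scores; every query counts equally."""
    if not per_query_ndcg:
        raise ValueError("no per-query scores given")
    return mean(per_query_ndcg.values())


# Hypothetical per-query scores keyed by query id (illustrative values only).
example_scores = {"q1": 0.5, "q2": 1.0}
overall = mean_ndcg(example_scores)
```

Because every query has equal weight, a query with few judged products influences the overall score as much as one with many.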

We can also try another ranking model:

$ python3 scripts/evaluate.py \
  --endpoint http://localhost \
  --example_file sample-data/test-sample.parquet \
  --ranking cross-title \
  --qrel_file https://data.vespa-cloud.com/sample-apps-data/test.qrels

For this query, the cross-title model produces an NDCG score of 0.8208, which is better than the semantic-title model.

Shutdown and remove the Docker container

$ docker rm -f vespa

Full evaluation

Download a pre-processed feed file with all (1,215,854) products:

$ curl -L -o product-search-products.jsonl.zstd \
    https://data.vespa-cloud.com/sample-apps-data/product-search-products.jsonl.zstd

This step is resource intensive as the semantic embedding model encodes the product title and description into the dense embedding vector space.

$ zstdcat product-search-products.jsonl.zstd | vespa feed -

Evaluate the semantic-title rank profile using the evaluation script (scripts/evaluate.py):

$ python3 scripts/evaluate.py \
  --endpoint http://localhost \
  --example_file "https://github.com/amazon-science/esci-data/blob/main/shopping_queries_dataset/shopping_queries_dataset_examples.parquet?raw=true" \
  --ranking semantic-title \
  --qrel_file https://data.vespa-cloud.com/sample-apps-data/test.qrels

For Vespa Cloud deployments, pass the data plane certificate and private key:

$ python3 scripts/evaluate.py \
  --endpoint https://productsearch.samples.aws-us-east-1c.perf.z.vespa-app.cloud \
  --example_file "https://github.com/amazon-science/esci-data/blob/main/shopping_queries_dataset/shopping_queries_dataset_examples.parquet?raw=true" \
  --ranking semantic-title \
  --cert <path-to-data-plane-cert.pem> \
  --key <path-to-data-plane-private-key.pem> \
  --qrel_file https://data.vespa-cloud.com/sample-apps-data/test.qrels
