search: enable AND operator and cross_field #916

Merged
merged 1 commit into from
Oct 22, 2024


@ntarocco ntarocco requested a review from sakshamarora1 August 30, 2024 16:13

@sakshamarora1 sakshamarora1 left a comment

The original example in the inline comment did not cover the complete case. I have updated it with the correct explanation, based on my testing and observations.

Essentially, we cannot use cross_fields with the AND operator: the operator forces both terms to be in the same field, which means the cross_fields type is effectively ignored.

If we remove the AND operator, cross_fields works, but the default OR then increases the number of loosely matching results.
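To make the three behaviours described above concrete, here is a toy in-memory model. It is only a sketch: it does not reproduce how OpenSearch analyzes, matches, or scores documents, and the record, the helper names, and the whitespace tokenizer are all assumptions made for illustration.

```python
# Toy model only: an illustrative record with two fields (not taken from the index).
DOC = {
    "title": "Quantum Field Theory and the Standard Model",
    "authors.full_name": "Schwartz, Matthew D.",
}


def tokens(text):
    """Lowercase, strip basic punctuation, split on whitespace."""
    return {word.strip(".,").lower() for word in text.split()}


def field_centric_and(doc, terms):
    """All terms must appear together inside one single field."""
    return any(all(t in tokens(value) for t in terms) for value in doc.values())


def term_centric_and(doc, terms):
    """Each term must appear in at least one field; any field will do."""
    return all(any(t in tokens(value) for value in doc.values()) for t in terms)


def any_term_or(doc, terms):
    """At least one term appears in at least one field."""
    return any(any(t in tokens(value) for value in doc.values()) for t in terms)


terms = ["quantum", "schwartz"]
print(field_centric_and(DOC, terms))              # False
print(term_centric_and(DOC, terms))               # True
print(any_term_or(DOC, ["quantum", "einstein"]))  # True
```

The field-centric check fails because no single field contains both terms, which mirrors the observed behaviour with AND. The term-centric check is what a cross-field AND search is expected to do, and the OR check shows why dropping AND lets loosely related records through.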

The best approach I was able to find was to inject the "AND" keyword between each word. This seems to solve the problem completely, but it is not an elegant solution.
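A minimal sketch of that "inject AND between each word" workaround could look like the following. The function name `inject_and` and the operator set are assumptions for illustration and are not taken from the PR; the sketch only handles bare whitespace-separated terms.

```python
# Boolean keywords of the query_string syntax that should be left untouched.
OPERATORS = {"AND", "OR", "NOT"}


def inject_and(query):
    """Join bare terms with an explicit AND, leaving existing operators alone."""
    out = []
    for word in query.split():
        # Insert AND only between two consecutive non-operator tokens.
        if out and word not in OPERATORS and out[-1] not in OPERATORS:
            out.append("AND")
        out.append(word)
    return " ".join(out)


print(inject_and("quantum field theory schwartz"))
# quantum AND field AND theory AND schwartz
```

This naive version would break quoted phrases, parentheses, and other query_string syntax, which is part of why the approach is inelegant.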

@zzacharo

Here are some tests run in the OpenSearch Dev Tools (from @ntarocco):

# 0s8wv-bxz77
GET /documents/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "query": "quantum AND schwartz",
      "fields": ["title", "authors.full_name"]
    }
  }
}

GET /documents/_search
{
  "profile": false,
  "query": {
    "query_string": {
      "query": "quantum schwartz",
      "type": "cross_fields",
      "default_operator": "AND"
    }
  }
}

GET /documents/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "query": "quantum field theory schwartz",
      "type": "cross_fields",
      "default_operator": "AND",
      "fields": ["title", "authors.full_name"]
    }
  }
}

GET /documents/_search
{
  "profile": false,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "field schwartz",
            "type": "cross_fields",
            "default_operator": "AND",
            "fields": ["title", "*"]
          }
        },
        {
          "term": {
            "pid": {
              "value": "0s8wv-bxz77"
            }
          }
        }
      ]
    }
  }
}

@zzacharo

Working query

GET /documents/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "query": "quantum schwartz",
      "type": "cross_fields",
      "default_operator": "AND",
      "analyzer": "standard"
    }
  }
}

Findings

Cross field queries: https://www.elastic.co/guide/en/elasticsearch/guide/current/_cross_fields_queries.html

For the cross_fields query type to work optimally, all fields should have the same analyzer. Fields that share an analyzer are grouped together as blended fields.
If you include fields with a different analysis chain, they will be added to the query in the same way as for best_fields. For instance, if we added the title field to the preceding query (assuming it uses a different analyzer), the explanation would be as follows:
(+title:peter +title:smith)

It seems that the title and authors.full_name fields use different analyzers. Forcing the analyzer at the query level, as in the working query above, seems to make the cross_fields type work as expected.
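As a rough sketch, the working query could be assembled in Python like this. The helper name `build_cross_fields_query` and the optional `fields` parameter are assumptions for illustration; this is not the code merged in cds_ils/config.py, and the working query in this thread did not pass `fields`.

```python
import json


def build_cross_fields_query(user_query, analyzer="standard", fields=None):
    """Return a query_string body mirroring the working query in this thread."""
    query_string = {
        "query": user_query,
        "type": "cross_fields",
        "default_operator": "AND",
        # Forcing one analyzer at query level so the fields can be blended.
        "analyzer": analyzer,
    }
    if fields:
        # Hypothetical extension: restrict the search to specific fields.
        query_string["fields"] = list(fields)
    return {"query": {"query_string": query_string}}


body = build_cross_fields_query("quantum schwartz")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the request body of a `_search` call. Whether forcing the "standard" analyzer is appropriate depends on how the index fields are actually mapped.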

cds_ils/config.py: outdated review thread (resolved)
@ntarocco ntarocco force-pushed the fix-search-and-operator branch from 52ca273 to 56df0af Compare October 21, 2024 19:34
@ntarocco ntarocco marked this pull request as ready for review October 21, 2024 19:34
@ntarocco ntarocco changed the title from "wip: fix me" to "search: enable AND operator and cross_field" Oct 21, 2024
@ntarocco ntarocco force-pushed the fix-search-and-operator branch 2 times, most recently from 887f039 to 7f1e695 Compare October 22, 2024 09:23
- enable searching in multiple fields with AND operator
@ntarocco ntarocco force-pushed the fix-search-and-operator branch from 7f1e695 to 0df6a5a Compare October 22, 2024 09:39
@ntarocco ntarocco merged commit c28af7b into master Oct 22, 2024
2 checks passed
@ntarocco ntarocco deleted the fix-search-and-operator branch October 22, 2024 09:41

4 participants