Search not matching words with apostrophes
Hi Low,
I know this topic has been hit on a few times but I am still struggling to get the results I require.
I have a lot of names that have apostrophes in them such as O'Connelly and O'Donovan. The issue is, as you have previously stated, the plugin filters this as O and Connelly.
I have set the channels in the collection tag, and they appear in a certain order using conditionals like this:
if low_search_collection_name == "collection_name_one"
output
if:elseif low_search_collection_name == "collection_name_two"
output
When I search for connelly alone the entries work perfectly and the entries related to my specific channel appear on top.
if low_search_collection_name == "collection_name_one"
o'connelly output
if:elseif low_search_collection_name == "collection_name_two"
o'connelly output
But when I search for O'Connelly a lot of the channels do not show.
if low_search_collection_name == "collection_name_one"
no output
if:elseif low_search_collection_name == "collection_name_two"
o'connelly output
I tried changing search modes and loose ends, but to no avail. Also, in collection_name_one, O'Connelly is the title, whereas in other areas it can be the title or part of a relationship field. That makes me think Low Search is fine with relationships but struggling with the title field.
If you could shed any light it would be very much appreciated.
Cheers
Chris
Replies
Low 27 Aug 2014 15:32
Terms like "O'Connolly" will be transformed to "o connolly", both in the entry index and in the given keywords. Non-alphanumeric characters are replaced by spaces, both in the index and in search terms. This goes for straight apostrophes: ' but also for curly ones: ‘ and ’.
So, effectively, using "O'Connolly" as a search term, will result in 2 keywords: "o" and "connolly", so search_mode is important (should be "all" here, because you don't want to be looking for just "o" as the only keyword).
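As a rough illustration of the transformation described above, here is a minimal Python sketch. It is not Low Search's actual code; the function name `to_keywords` is made up for this example, and it simply assumes "lowercase, replace every non-alphanumeric character with a space, split on whitespace".

```python
import re

def to_keywords(term: str) -> list[str]:
    """Hypothetical helper: mimic the described keyword normalisation.

    Lowercases the term, replaces each run of non-alphanumeric characters
    (straight and curly apostrophes included) with a space, then splits.
    """
    cleaned = re.sub(r"[\W_]+", " ", term.lower())
    return cleaned.split()

print(to_keywords("O'Connolly"))       # straight apostrophe
print(to_keywords("O\u2019Connolly"))  # curly apostrophe
```

Both calls yield the two keywords "o" and "connolly", which is why search_mode matters for names like this.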
Whether the term is present in the title or another field is irrelevant, except when the field's weight is different.
You can check the exp_low_search_indexes table to see how the terms are put in the database.
FWIW: I'm thinking of adding a setting for characters that should be removed entirely, rather than replaced by a space, which might help with this issue as well. Except when searching for "O Connolly", which would then return no results... As you can expect, there are always pros and cons to each solution.
verydisco 28 Aug 2014 10:12
Hi Low,
I had search_mode set to "all" but had no joy with that; it still doesn't pull O'Connelly entries from my top channel/collection.
Not too sure how to fix this one. Even if I created a separate input that would save the title O'Connelly as OConnelly, and then tweaked the plugin to merge across apostrophes rather than create a space (i.e. OConnelly instead of O Connelly), I run the risk of screwing it up for other terms down the line.
Low 28 Aug 2014 10:31
I'll need to take a look myself. Please send SuperAdmin login credentials to hi at gotolow dot com if you can. I'll see if I can reproduce it and track down what's happening.
Low 29 Aug 2014 09:16
From what I can see, searching for "Connell" and searching for "O'Connell" both return the same number of search results. However, the latter will trigger the alternative search method (using LIKE) instead of the preferred full-text method (using MATCH/AGAINST), because the word "o" falls below the full-text word length threshold. The LIKE search uses a custom scoring mechanism as well, which scores results differently than the full-text search (which is a black box).
This means other entries are scored higher than with the full-text search. But the number of entries is the same, so it only looks like entries are missing.
Of course, this would be avoided if the apostrophes were replaced by an empty string, converting the term to "oconnell", which doesn't fall below the threshold and would trigger the full-text search, including its scoring algorithm. With the downside that searching for just "connell" would return no results, of course...
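To make the trade-off concrete, here is a small Python sketch of the decision as described in this reply. It is an illustration only, not Low Search's implementation: the names `keywords` and `search_method` are hypothetical, the threshold of 4 is an assumption based on MySQL's default `ft_min_word_len` (your server may be configured differently), and the rule "any keyword shorter than the threshold triggers the LIKE fallback" is my reading of the explanation above.

```python
import re

# Assumption: MySQL's default ft_min_word_len is 4; check your own server.
MIN_WORD_LEN = 4

def keywords(term: str, strip_apostrophes: bool = False) -> list[str]:
    """Hypothetical normaliser: optionally drop apostrophes, then
    replace every other non-alphanumeric character with a space."""
    term = term.lower()
    if strip_apostrophes:
        # Remove straight and curly apostrophes entirely.
        term = re.sub(r"['\u2018\u2019]", "", term)
    return re.sub(r"[\W_]+", " ", term).split()

def search_method(words: list[str]) -> str:
    """Return "fulltext" only if every keyword meets the length threshold."""
    if words and all(len(w) >= MIN_WORD_LEN for w in words):
        return "fulltext"
    return "like"

for term, strip in [("Connell", False), ("O'Connell", False), ("O'Connell", True)]:
    words = keywords(term, strip)
    print(term, strip, words, search_method(words))
```

With apostrophes replaced by spaces, "O'Connell" produces the short keyword "o" and falls back to LIKE; with apostrophes stripped it becomes the single keyword "oconnell" and stays on the full-text path, at the cost of no longer matching a search for "connell" alone.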
verydisco 29 Aug 2014 11:30
I see what you mean. I thought specifying {if low_search_collection_name == "channel/collection"} would trump the default order of importance, which is not the case, as you pointed out. Is there a way to ensure that any result from the specified channel ranks at the top of the results?
I would prefer not to change things so that a search for 'connelly' comes up empty; my client just wants this particular section to be on top for all results.
Low 29 Aug 2014 11:53
There is currently no way to force a given collection to the top of the results in a single sequence. In your case, you might get better results if the gallery entries weren't included at all.
verydisco 29 Aug 2014 13:45
Bad times to hear, but you did give me an idea, that got it working-ish.
I took my channel out of the collection string and gave it its own results tag, wrapped in an if statement to account for pagination.
Basically,
{if segment_4==""}
{exp:low_search:results query="{segment_3}" collection="channel_one" loose_ends="both" search_mode="all" limit="20"}
Thing I want to show first
{/exp:low_search:results}
{/if}
{exp:low_search:results query="{segment_3}" collection="channel_two|channel_three|channel_four" loose_ends="both" search_mode="all" limit="20"}
Thing I want
{/exp:low_search:results}
Granted, it comes with its own issues, but channel_one doesn't have that many entries, and the search will hopefully never pull that many. The if statement stops it from showing on the paginated pages.
Swings and roundabouts I guess.