Problem/Motivation
The core Search module allows the use of AND (implicit) and OR (explicit) in keyword/node search. For instance, you can search for "boy dog" and it will look for nodes with both boy and dog in them; you can search for "boy OR dog" and it will find nodes with either.
This normally works, but because of the way the "first pass" of the query works, using the same keyword twice in a query (which is a valid thing to want to do) can make the search fail even though it shouldn't.
For example, if you create a node with the following body:
I created this page with some very interesting content in it.
Here are some search results, all for queries that should find the node -- keep in mind that AND is implied unless you explicitly type OR, and that parens are not recognized:
- this
  - found page
- this OR that
  - found page
- this OR that this
  - found page
- this OR that this OR foo
  - DID NOT FIND
- this OR that this OR interesting
  - found page
- this OR foo this OR interesting
  - found page
- this OR foo this OR foo
  - found page
All of these results are correct except the one for "this OR that this OR foo", which should find the page but does not.
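The expected behavior for these queries can be sketched in a few lines of Python. This is only an illustrative model of the AND/OR semantics described above, not the Search module's code; the function names `parse_groups` and `should_find` are made up for this example, and it assumes that only an uppercase OR acts as the operator and that matching is case-insensitive on whole words.

```python
import re


def parse_groups(expression):
    """Split an expression into AND-ed groups of OR-ed keywords."""
    groups = []
    join_next = False  # True when the previous token was an explicit OR
    for token in expression.split():
        if token == "OR":  # assumption: only uppercase OR is an operator
            join_next = bool(groups)  # a leading OR has nothing to join to
            continue
        if join_next:
            groups[-1].append(token.lower())
        else:
            groups.append([token.lower()])
        join_next = False
    return groups


def should_find(expression, body):
    """Return True if every AND-ed group has at least one keyword in body."""
    words = set(re.findall(r"\w+", body.lower()))
    return all(
        any(keyword in words for keyword in group)
        for group in parse_groups(expression)
    )


BODY = "I created this page with some very interesting content in it."
QUERIES = [
    "this",
    "this OR that",
    "this OR that this",
    "this OR that this OR foo",
    "this OR that this OR interesting",
    "this OR foo this OR interesting",
    "this OR foo this OR foo",
]

for query in QUERIES:
    print(query, "->", should_find(query, BODY))
```

Under this model every query in the list prints True, including "this OR that this OR foo", which is the one the search currently misses.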
Proposed resolution
Fix the SearchQuery class so that these searches all work. See comment #3 for an explanation of the root cause -- basically, the "first pass" query incorrectly reports that the search has no matches, and the search is aborted. The solution is to treat a search with multiple OR clauses as "not simple" so that it falls through to the full query (which does work correctly).
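As a rough illustration of the rule proposed here (and not the actual SearchQuery code or the patch), the decision could be modeled like this in Python. The names `count_or_clauses` and `needs_full_query` are hypothetical, and the sketch covers only the OR rule; any other conditions that may make a query "not simple" are left out.

```python
def count_or_clauses(expression):
    """Count keyword groups that are joined by at least one explicit OR."""
    group_sizes = []
    join_next = False  # True when the previous token was an explicit OR
    for token in expression.split():
        if token == "OR":
            join_next = bool(group_sizes)  # ignore an OR with nothing before it
            continue
        if join_next:
            group_sizes[-1] += 1   # keyword joins the current OR clause
        else:
            group_sizes.append(1)  # keyword starts a new AND-ed group
        join_next = False
    return sum(1 for size in group_sizes if size > 1)


def needs_full_query(expression):
    """Hypothetical rule: more than one OR clause means 'not simple'."""
    return count_or_clauses(expression) > 1


for query in ["this OR that this", "this OR that this OR foo"]:
    print(query, "->", needs_full_query(query))
```

With this rule, "this OR that this" (one OR clause) could still be answered by the first pass alone, while "this OR that this OR foo" (two OR clauses) would always go on to the full query.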
Remaining tasks
Get the patch reviewed and committed (it works and has tests).
User interface changes
None, except that more searches will work correctly.
API changes
None.
Original report by @spartlow
Search combines keywords by AND except where adjacent keywords explicitly use OR. So this search:
match1 OR miss1 match2 OR miss2
is treated as:
(match1 OR miss1) AND (match2 OR miss2)
and will return a node that has match1 and match2 in it (but happens to not have miss1 or miss2 in it).
However, if the same exact keyword is used in two OR statements such as:
match1 OR miss1 match1 OR miss2
then that same node is NOT returned. This is the bug.
Interestingly, the following search WILL find that node. It fails only when an additional OR is added.
match1 OR miss1 match1
Background:
I'd like to use search hooks to conditionally add certain keywords to searches. Ideally, I'd search for (user_key1 user_key2) OR (my_key), but parentheses are not recognized. So instead I have to add "OR my_key" to each of the user's search keywords, which I suspect would work if it weren't for this bug. I'm still looking for a workaround.
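The rewrite described in this background can be sketched as follows. This is a minimal Python illustration, not Drupal hook code: `add_alternative` and `rewrite_is_equivalent` are hypothetical names, and the sketch assumes the user's input is a plain space-separated list of keywords with no OR or quoted phrases of its own. The second function checks, by brute force over every combination of present and absent keywords, that AND-ing the "OR my_key" groups is logically the same as (user_key1 user_key2) OR (my_key).

```python
from itertools import product


def add_alternative(user_keywords, extra_keyword):
    """Append 'OR extra_keyword' to each space-separated user keyword."""
    return " ".join(
        f"{word} OR {extra_keyword}" for word in user_keywords.split()
    )


def rewrite_is_equivalent():
    """Check (k1 AND k2) OR m == (k1 OR m) AND (k2 OR m) for every case."""
    return all(
        ((k1 and k2) or m) == ((k1 or m) and (k2 or m))
        for k1, k2, m in product([False, True], repeat=3)
    )


print(add_alternative("user_key1 user_key2", "my_key"))
# prints: user_key1 OR my_key user_key2 OR my_key
print(rewrite_is_equivalent())
# prints: True
```

Note that a rewritten query of this form always repeats the same keyword in several OR clauses, which is exactly the pattern affected by this bug.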