Gallery3's search function

tempg

Joined: 2005-12-17
Posts: 1633
Posted: Sat, 2010-11-27 22:18

I looked around a little, but I'm not sure what runs the search function (e.g. Gallery custom script somewhere or Kohana script somewhere).

I'm noticing that Gallery3's search doesn't return partial words.
If, for example, you search for "lion" Gallery will return photos tagged "lion," but not photos tagged "lions."
Similarly, if searching for "key," Gallery doesn't return results with photos tagged "keyword"--unless the exact term "key" appears elsewhere for that photo (e.g. in the title).

Is this by design? Have I overlooked a setting to adjust this?

 
bharat

Joined: 2002-05-21
Posts: 7985
Posted: Sat, 2010-11-27 23:02

You might find http://gallery.menalto.com/node/99240 interesting.

We're using MySQL FullText search: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
There may be ways to configure it to do what you want. If you discover any options that we can control in PHP, let me know and I can work them into the code.
---
Problems? Check gallery3/var/logs
bugs/feature req's | upgrade to the latest code | use git

 
tempg

Joined: 2005-12-17
Posts: 1633
Posted: Sun, 2010-11-28 00:20

Starting point:

I'd have to look at it some more (obviously) but it looks like the search terms could be isolated and duplicated.

For example, consider a search for: go home
The terms needed would be 'go home go* home*' and should return 'go home,' 'going home,' 'goes home,' 'golden homes,' etc.
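
A minimal sketch of that reprocessing step in PHP, assuming plain whitespace-separated words with no quoted phrases. add_prefix_terms is a made-up name for illustration, not a Gallery function. The sketch leaves off the boolean-mode + operator on purpose: in MySQL boolean mode a leading + makes a word mandatory, which would filter out rows that only contain a prefix match such as 'going'.

```php
<?php
// Hypothetical helper (not Gallery code): append a prefix-wildcard copy of
// every word, e.g. "go home" -> "go home go* home*".
function add_prefix_terms($input) {
  // Split on whitespace and drop empty pieces caused by repeated spaces
  $words = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
  $prefixes = array();
  foreach ($words as $word) {
    $prefixes[] = $word . "*";  // matches any word starting with $word
  }
  // Original words first, wildcarded copies after them
  return implode(" ", array_merge($words, $prefixes));
}

echo add_prefix_terms("go home"), "\n";  // go home go* home*
```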

One consideration would be checking whether any of the search terms ends in an 's' and, if so, removing it prior to the search.

(I think that would be okay in all instances. If searching for 'cherries' it would end up searching for 'cherrie*' but I don't think that's a problem. I'll dig deeper, but I doubt there's a solution that will return 'cherry' for 'cherries' or 'go' for 'going,' nor do I think that would be useful enough for most users to justify the effort of implementing it.)
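
A quick illustration of the 's' stripping, assuming PHP's rtrim() is used for it. Note that rtrim() with a character list removes every trailing 's', not just one, so words ending in a double 's' get shortened further. strip_one_s is a hypothetical alternative shown only for comparison.

```php
<?php
// rtrim() with a character list strips EVERY trailing "s", not just one
echo rtrim("cherries", "s") . "*", "\n";  // cherrie*
echo rtrim("glass", "s") . "*", "\n";     // gla*

// Hypothetical alternative: drop a single trailing "s" only
function strip_one_s($term) {
  return (substr($term, -1) === "s") ? substr($term, 0, -1) : $term;
}

echo strip_one_s("glass") . "*", "\n";    // glas*
```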

If an exact phrase is needed (e.g. 'go home' and nothing else), user could just use quotes, as is standard on all major search engines.

I have to do more looking to see how this would affect the relative weights/scores of the results.
It looks like Gallery never really modifies the search string. For this to work, the string would have to be parsed for spaces (to isolate terms) and reprocessed into a new string to pass to search_records.

I have some experience coding and still write some algorithms/pseudocode here and there, BUT: I'm more a designer than a programmer so let me know if any of that sounds like crap.

 
bharat

Joined: 2002-05-21
Posts: 7985
Posted: Sun, 2010-11-28 00:50

I suspect that some simple parsing here would go a long way. If you're interested in working on this, that'd be great!

 
tempg

Joined: 2005-12-17
Posts: 1633
Posted: Sun, 2010-11-28 19:55

My first attempt at outlining what's needed (I'm not sure where to put this to try it out in my Gallery):

$terms1 = explode(" ",$orig_search,5);
$terms2 = array();

function create_new_terms($term)
{
	global $terms2;
	// Skip terms that start with a quote or already end in a wildcard
	if ((substr($term,0,1) != "\"") && (substr($term,-1) != "*"))
	{
		$terms2[] = rtrim($term,'s')."*";
	}
}

array_walk($terms1, "create_new_terms");
// Append the wildcarded terms to the original ones, then drop duplicates
$terms1 = array_merge($terms1, $terms2);
$terms1 = array_unique($terms1);
$final_terms = implode(" ",$terms1);

Again, I'm not exactly a programmer, but I think this gets it started.
There's a little extra funniness going on there because I'm not sure what the Gallery variables are.
$orig_search would be the search string as entered by the user.
$final_terms is the final string that will actually be used.
I'm also not at all sure about the format of the if statement inside create_new_terms(); the goal is to ignore a search term if it's in quotes and/or it already ends in a wildcard character.

 
bharat

Joined: 2002-05-21
Posts: 7985
Posted: Sun, 2010-11-28 21:39

Try using it to modify the $q variable in the search() function in modules/search/helpers/search.php and see if that works

 
tempg

Joined: 2005-12-17
Posts: 1633
Posted: Fri, 2011-03-25 14:05

Okay, been a while, but my version of the search function is below.

It's not a wonderchild so searching for "goose" will not show results for "geese," but searching for "tables" should return results for "table" and searching for "table" should also return results for "tables," "tabletop," etc.

The search starts at the beginning of words, so searching for "end" will also return "ends," but not "bend." I did that on purpose because I'd otherwise have far too many results in my gallery.

Any exact matches will appear first, so, in the above example, results for "end" should appear before results for "ends" or "ending." [EDITED to correct the example]

I'm sure there's somewhere else I'm supposed to upload this, but I don't know where. Tell me where and I'll upload it there later today.

class Search_Controller extends Controller {
  public function index() {
    $page_size = module::get_var("gallery", "page_size", 9);
    $q = Input::instance()->get("q");
    $page = Input::instance()->get("page", 1);
    $offset = ($page - 1) * $page_size;
    
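    // Split the query into at most 5 pieces; the 5th holds any remaining words
    $m1 = explode(" ", $q, 5);

    // Return a de-pluralized, wildcarded copy of a term. Quoted or already
    // wildcarded terms return nothing (null), which implode() treats as "".
    function crnete($t1) {
      if ((substr($t1, 0, 1) != "\"") && (substr($t1, -1, 1) != "*")) {
        return rtrim($t1, 's') . "*";
      }
    }

    // Join two strings with a space (callback for array_reduce)
    function cotems($z1, $z2) { return $z1 . " " . $z2; }

    // Keep the original terms first and append the wildcarded copies
    array_push($m1, implode(" ", array_map("crnete", $m1)));
    $q2 = array_reduce($m1, "cotems");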
    
    // Make sure that the page references a valid offset
    if ($page < 1) {
      $page = 1;
    }

    list ($count, $result) = search::search($q2, $page_size, $offset);

    $max_pages = max(ceil($count / $page_size), 1);

    $template = new Theme_View("page.html", "collection", "search");
    $template->set_global("page", $page);
    $template->set_global("max_pages", $max_pages);
    $template->set_global("page_size", $page_size);
    $template->set_global("children_count", $count);

    $template->content = new View("search.html");
    $template->content->items = $result;
    $template->content->q = $q;

    print $template;
  }
}
 
tempg

Joined: 2005-12-17
Posts: 1633
Posted: Thu, 2011-03-24 17:55

Quick notes:
(1) This file replaces the current search module's controller search.php file.
(2) I purposely set it to only repurpose 5 of the search terms (which results in 10 search terms); otherwise the list of terms could get excessive if someone enters a long search phrase instead of just a word or two. Basically, the first four words are treated and repurposed and all remaining words are treated as one long term. This means that if "blue" is the 5th word, results won't be included for "blues" (unless that happens to be the last word). I don't think this would be much of an issue though.
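
To make note (2) concrete, here is a small sketch of how explode() behaves with a limit of 5. The sample phrase is made up for illustration.

```php
<?php
// With a limit of 5, explode() returns at most 5 pieces; the 5th piece
// holds the rest of the string, spaces included.
$pieces = explode(" ", "one two three four blue sky", 5);
print_r($pieces);

// $pieces[4] is "blue sky", so wildcarding it only expands the last word:
// "sky" gets a wildcard, "blue" does not.
echo rtrim($pieces[4], "s") . "*", "\n";  // blue sky*
```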

 
bharat

Joined: 2002-05-21
Posts: 7985
Posted: Sun, 2011-03-27 18:13

Ok I've reworked this a bit to make it fit our style guide and tighten it up. Here's what it looks like now:

in modules/search/helpers/search.php add this function:

  /**
   * Add more terms to the query by wildcarding the stem value of the first
   * few terms in the query.
   */
  static function add_query_terms($q) {
    $MAX_TERMS = 5;
    $terms = explode(" ", $q, $MAX_TERMS);
    for ($i = 0; $i < min(count($terms), $MAX_TERMS - 1); $i++) {
      // Don't wildcard quoted or already wildcarded terms
      if ((substr($terms[$i], 0, 1) != '"') && (substr($terms[$i], -1, 1) != "*")) {
        $terms[] = rtrim($terms[$i], "s") . "*";
      }
    }
    return implode(" ", $terms);
  }

in modules/search/controllers/search.php:

    $q = search::add_query_terms($q);
    list ($count, $result) = search::search($q, $page_size, $offset);
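
For anyone who wants to try the helper outside Gallery, here is a standalone sketch. It assumes a plain function instead of a static method on the search helper class; the sample queries are made up. Note that in this version the fifth piece (everything after the fourth word) is passed through without a wildcard.

```php
<?php
// Standalone copy of the helper for experimenting outside Gallery; in
// Gallery it would be a static method on the search helper class.
function add_query_terms($q) {
  $MAX_TERMS = 5;
  $terms = explode(" ", $q, $MAX_TERMS);
  for ($i = 0; $i < min(count($terms), $MAX_TERMS - 1); $i++) {
    // Don't wildcard quoted or already wildcarded terms
    if ((substr($terms[$i], 0, 1) != '"') && (substr($terms[$i], -1, 1) != "*")) {
      $terms[] = rtrim($terms[$i], "s") . "*";
    }
  }
  return implode(" ", $terms);
}

echo add_query_terms("neighbor"), "\n";    // neighbor neighbor*
echo add_query_terms("red tables"), "\n";  // red tables red* table*
echo add_query_terms("lion*"), "\n";       // lion*
// one two three four blue sky one* two* three* four*
echo add_query_terms("one two three four blue sky"), "\n";
```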

I'm happy to submit this on your behalf, or if you'd like you can create a fork of gallery3 on github and submit it yourself and I'll pull it in to the mainline (then this change will go in under your name and you'll get the credit for your work). Let me know what you'd like to do.
---
Problems? Check gallery3/var/logs
file a bug/feature ticket | upgrade to the latest code! | hacking G3? join us on IRC!

 
tempg

Joined: 2005-12-17
Posts: 1633
Posted: Sun, 2011-03-27 19:25

@bharat: Thanks!! The only difference when using your code is the list of terms that the results page displays. This might not matter for most users, but I like it better when the page only shows the user what they submitted. (For example, when searching for "neighbor", the results page now shows the search as "neighbor neighbor*"; again, this doesn't affect the results, just what the user sees, and most gallery users probably won't care.)

Submission: I'm not sure how github works; not sure how to create a fork. I wouldn't mind submitting it (and I plan on more submissions later so I eventually may need to just figure it out). Looks like I need to install some software first? I'm reading up on help.github.com but, in the meantime, I don't mind if you want to just get it done and submit it so that it's ready for the next release and doesn't get forgotten.

 
bharat

Joined: 2002-05-21
Posts: 7985
Posted: Sun, 2011-03-27 20:16

This guide might help on the git side:
http://codex.gallery2.org/Gallery:Using_Git:Advanced_Topics

Feel free to edit it as you learn stuff. Essentially you'll have your own parallel (forked) version of Gallery 3 and we'll be able to see your changes and you'll have an easy way to keep it in sync with the main line. When you make changes to your fork, we'll be able to just click a button and pull them into the main line. It's pretty useful.

I agree with you about showing the expanded term set. I've tweaked it so that doesn't show. I'll go ahead and submit this for now; let me know if you need help getting your forked repo up!