Sunspot: A Solr-Powered Search Engine for Ruby

Search can make or break your website. Sunspot and Solr give you an intuitive engine that maps directly to your Ruby objects.

As your site amasses content, be it stories, SKUs, or statistics, a tailored, effective, and exacting search engine becomes increasingly vital. Imagine a bookstore that doesn’t index its tomes by title or author, or a clothing retailer that doesn’t index garments by size. Without search, each site is useless. In general, the quality and relevance of search results make or break a site.

Typically, at least some search results are generated by the site’s underlying database. A database can maintain and catalog enormous volumes of data. Thus, an inquiry for an obscure piece of content would likely fall to the database since it’s the canonical repository. Complex, multi-variate queries may also fall to the database as it’s designed specifically for the purpose.

However, a database query can be slow. Like any engineering feat, a database has strengths and weaknesses, and one size rarely fits all. Hence, it’s also typical for other actors to provide search results. For instance, a content management application might use an entirely separate engine to index the prose and respond to keyword, phrase, and proximity searches.

Of course, it’s also de rigueur for an application to maintain one or more RAM-resident indices to speed common lookups and preclude repetitive computations. Additionally, software such as memcached provides surrogate memory extending across multiple machines.

Each technique—the (relational or object) database, special-purpose engine, and in-memory data structures—is a viable option; whether one, some, or all of the approaches are valid depends on the application at hand.

You are no doubt familiar with MySQL, SQLite, Oracle, and any number of other database packages. Each is capable, and there are scores of strategies to tune performance, from writing efficient queries to tuning the database host’s I/O subsystem. You are also likely familiar with memcached, Squid, and other helpful proxies (both figurative and literal) that offload work from the database and application server. You can find plenty of books on those subjects.

And what about the specialized search engine? Certainly, there are plenty of commercial text indexers (FAST, Documentum) and even a few open source solutions (Zebra, Lucene, Sphinx), but what if you need to index 350,000 auto parts, 50,000 email messages, or 4,000 grocery items?

If you code in Ruby, Sunspot is an ideal solution for any of those inventories. Created by Mat Brown, Sunspot is an intuitive and expressive domain-specific language for indexing and searching Ruby objects. Sunspot is powered by Solr, the open source enterprise search server based on Lucene. Solr can highlight matches, replicate its index across many servers, shard indices, and, for the purposes of this discussion, perform advanced full-text and faceted searches.

Wanted: Indexing Engine. Apply Within.

To search your data with Sunspot, you create an index for one or more classes, populate the index, and then search using any of the indexed fields. Here’s an example use of Sunspot to index books for sale.

class Book
  # A book has an author, a title, a publisher, an edition, 10- and
  # 13-digit ISBNs, a blurb, a publication date, and a price.
  attr_accessor :author, :title, :publisher, :edition,
                :isbn10, :isbn13, :blurb, :published_at, :price
end

To create an index, call Sunspot.setup(class) with a block, where class is the Ruby class to catalog and the block is a list of Sunspot declarations (part of Sunspot’s domain-specific language) that describe how Solr should treat one, some, or all of the class’s attributes.

Sunspot.setup(Book) do
  text    :author
  text    :blurb
  integer :edition
  string  :isbn10, :isbn13
  float   :price
  time    :published_at
  text    :publisher
  string  :publisher   # attribute field, so a search can facet on publisher
  string  :sort_title do
    title.downcase.sub(/^(an?|the)\W+/, '') if title = self.title
  end
  text    :title
end

float, time, and the others are Sunspot DSL keywords that define Solr types. Most of the type keywords are self-explanatory, but string and text require a little clarification. Use the former for values (like a UPC code) where full-text indexing doesn’t make sense; use the latter for full-text fields.
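To make the string/text distinction concrete, here is a toy model in plain Ruby. It is only a sketch: the helpers string_match?, tokenize, and text_match? are hypothetical, the sample values are made up, and the tokenizing is far simpler than the analysis Solr actually performs. It does show why an exact-value field and a full-text field answer different questions.

```ruby
# Toy model only: Solr's real text analysis is far more sophisticated.

# A "string" field is compared as a single, exact value.
def string_match?(stored, query)
  stored == query
end

# A "text" field is broken into lowercase terms.
def tokenize(value)
  value.downcase.scan(/\w+/)
end

# A query matches a text field if every query term is among the stored terms.
def text_match?(stored, query)
  (tokenize(query) - tokenize(stored)).empty?
end

isbn10 = '0123456789' # made-up sample value
title  = "The Hitchhiker's Guide to the Galaxy"

puts string_match?(isbn10, '0123456789') # whole value matches
puts string_match?(isbn10, '0123')       # a fragment does not
puts text_match?(title, 'galaxy guide')  # both terms occur in the title
puts text_match?(title, 'restaurant')    # term does not occur
```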

The code above indexes every attribute of a Book, since a consumer might want to search or sort on any of those values. The code also adds a virtual attribute, :sort_title, which lowercases the title and strips a leading “a,” “an,” or “the” so that sorted results appear in a natural order. You can search the full text of :title; because :title is a full-text field, sort results by :sort_title rather than by :title itself.
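The effect of the :sort_title block is easy to see outside of Sunspot. The sketch below applies the same expression in a standalone helper (the method name sort_title and the sample titles are chosen here for illustration) and compares a naive sort with a sort on the normalized value.

```ruby
# Same normalization as the :sort_title block: lowercase the title,
# then strip one leading article ("a", "an", or "the").
def sort_title(title)
  title.downcase.sub(/^(an?|the)\W+/, '') if title
end

titles = ['The Hobbit', 'A Tale of Two Cities', 'Animal Farm', 'An Ideal Husband']

# Naive sort: leading articles decide the order.
puts titles.sort.inspect

# Sort on the normalized title: articles are ignored.
puts titles.sort_by { |t| sort_title(t) }.inspect
```

Note that “Animal Farm” is left intact: the pattern requires a non-word character after the article, so a title that merely begins with the letters “an” is not truncated.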

The next step is to populate the index with data. (The previous step merely describes which fields to index; it does not catalog any values.) Assuming you’ve created a new collection of books in the array book_list, adding the books to the index requires just two statements.

Sunspot.index( book_list )
Sunspot.commit

The first statement adds each book in book_list to the in-memory Solr index. The second statement commits those additions to the Solr instance’s persistent store.
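The two-step workflow can be pictured with a small toy model. The ToyIndex class below is hypothetical and does not talk to Solr or Sunspot; it assumes no automatic commits are configured and simply illustrates that additions are pending until a commit is issued.

```ruby
# Toy model only: it mimics the add-then-commit workflow, not Solr itself.
class ToyIndex
  def initialize
    @pending   = [] # documents added but not yet committed
    @committed = [] # documents that searches can see
  end

  # Queue one or more documents (or arrays of documents).
  def index(*docs)
    @pending.concat(docs.flatten)
    self
  end

  # Move everything that is pending into the committed set.
  def commit
    @committed.concat(@pending)
    @pending.clear
    self
  end

  # Searches only see committed documents.
  def search
    @committed.dup
  end
end

toy = ToyIndex.new
toy.index(['Book A', 'Book B'])
puts toy.search.size # nothing has been committed yet
toy.commit
puts toy.search.size # both books are now visible
```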

Sunspot includes other methods to manipulate the index, too, such as remove(instances) and remove_all(classes). The former removes one or more instances from the index; a subsequent commit is required to affect the persistent index. (Optionally, you can call remove!(instances) to delete and commit in one fell swoop.) The latter removes all instances of one or more classes from the index, and has an analog, remove_all!(classes), that commits implicitly.

And, naturally, there is a method to search the index using a variety of criteria.

Sunspot.search(Book) do
  facet :publisher
  keywords 'Zaphod'
  with(:edition, 1)
  with(:price).less_than 19.99
  order_by :sort_title
end

The query above says, “Search the current index for all first-edition books priced under $19.99 where ‘Zaphod’ appears in a full-text field, and return the results sorted by the special sorting title.” In addition to those results, the statement facet :publisher also produces a list of publishers whose books match the criteria, along with the number of matching books for each. You can use a facet to drill down and produce a list of matching books sold by a specific publisher.
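To see what each criterion contributes, it can help to spell out the same logic in plain Ruby over a handful of in-memory records. This is only an illustration: SampleBook, SAMPLE_BOOKS, search_books, and publisher_facet are hypothetical names, the data is invented, and nothing here calls Sunspot or Solr, which would do this work against the index rather than an array.

```ruby
# Plain-Ruby illustration of the query's logic on made-up sample data.
SampleBook = Struct.new(:title, :sort_title, :publisher, :edition, :price, :blurb)

SAMPLE_BOOKS = [
  SampleBook.new('The Zaphod Files', 'zaphod files', 'Acme Press', 1, 9.99, 'Zaphod returns.'),
  SampleBook.new('A Guide for Zaphod', 'guide for zaphod', 'Bolt Books', 1, 14.50, 'A travel guide.'),
  SampleBook.new('Zaphod Unbound', 'zaphod unbound', 'Acme Press', 2, 12.00, 'Second edition.'),
  SampleBook.new('Zaphod Deluxe', 'zaphod deluxe', 'Acme Press', 1, 29.99, 'Collector copy.'),
  SampleBook.new('Cooking Basics', 'cooking basics', 'Bolt Books', 1, 8.00, 'Simple recipes.')
].freeze

def search_books(books)
  books
    .select { |b| "#{b.title} #{b.blurb}".match?(/\bzaphod\b/i) } # keywords 'Zaphod'
    .select { |b| b.edition == 1 }                                 # with(:edition, 1)
    .select { |b| b.price < 19.99 }                                # with(:price).less_than 19.99
    .sort_by(&:sort_title)                                         # order_by :sort_title
end

# facet :publisher: count the matching books per publisher.
def publisher_facet(matches)
  matches.each_with_object(Hash.new(0)) { |b, counts| counts[b.publisher] += 1 }
end

matches = search_books(SAMPLE_BOOKS)
puts matches.map(&:title).inspect
puts publisher_facet(matches).inspect
```

Only two of the five sample books survive all three filters, and the facet reports one match for each of the two publishers. Drilling down on a publisher then amounts to adding one more filter on that publisher’s name.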


Comments on "Sunspot: A Solr-Powered Search Engine for Ruby"

adedip

Is there a way to filter by score?
I mean, what if I want to retrieve only the records with a score higher than 0.5?

And by the way, this is the best and most detailed article I’ve ever read about Sunspot!
