Sunspot: A Solr-Powered Search Engine for Ruby

Search can make or break your website. Sunspot and Solr give you an intuitive engine that maps directly to your Ruby objects.

Show Me the Code

Here’s a Sunspot for Ruby ditty to play with. To use this code, you must first install and launch the development Solr server provided with Sunspot.

$ sudo gem install optiflag -v 0.6.5
$ sudo gem install solr-ruby
$ sudo gem install outoftime-sunspot --source=http://gems.github.com
$ sunspot-solr start

By default, the Sunspot Solr server listens on port 8983 and stores its index in /tmp. Port 8983 is also the usual port for production Solr servers, so it may conflict if you are already running Solr. You can change the port with the -p portnumber option. You can also change the index’s directory with -d /path/to/storage.

The sample application searches for books in inventory. In a nutshell, the application uses RAM to persist objects (for the purpose of demonstration) and uses Sunspot to make complex and very fast queries with Solr.

The first listing is the application; the second is an adapter named Memory that serves as a bridge between Solr and the application. More specifically, it adapts a vanilla Ruby object to provide methods needed to map from the application to Solr and vice versa.

require 'memory'

class Book
  # A book includes instance variables for
  # the author, a title, a publisher, an edition, a 10- and 13-digit
  # ISBN number, a blurb, a publication date, and a price.

  attr_accessor :author, :blurb, :edition,
    :isbn10, :isbn13, :price, :published_at, :publisher, :title

  def id
    self.object_id
  end

  def initialize( attrs = {} )
    attrs.each_pair { |attribute, value|
      self.send "#{attribute}=", value }
  end
end

Sunspot::Adapters::InstanceAdapter.register(
  Memory::InstanceAdapter, Book)

Sunspot::Adapters::DataAccessor.register(
  Memory::DataAccessor, Book )

Sunspot.setup(Book) do
  text    :author
  text    :blurb
  integer :edition
  string  :isbn10, :isbn13
  float   :price
  time    :published_at
  text    :publisher
  string  :sort_title do
    title.downcase.sub(/^(an?|the)\W+/, '') if title = self.title
  end
  text    :title
end

Sunspot.index( king = Book.new( {
  :author     => 'Stephen King',
  :blurb      => 'Things get really weird out West',
  :edition    => 1,
  :isbn10     => '1234567890',
  :isbn13     => 'abcdef0123456',
  :price      => 12.99,
  :published_at => Time.now,
  :publisher  => 'Random Number House',
  :title      => 'The Dark Tower' } ) )

Sunspot.index( reaper = Book.new( {
  :author     => 'Josh Bazell',
  :blurb      => 'A hitman becomes a doctor',
  :edition    => 1,
  :isbn10     => '9876543210',
  :isbn13     => 'abcdef1111111',
  :price      => 25.99,
  :published_at => Time.now,
  :publisher  => 'Knopf',
  :title      => 'Beat the Reaper' } ) )

Sunspot.commit

Sunspot.search( Book ) { keywords 'King' }.results.each {|x|
  puts x.title }

search2 = Sunspot.search( Book ) do
  with( :price ).less_than( 30 )
end

search2.results.each { |s| puts s.title } 

Sunspot.remove_all!( Book )

# memory.rb
require 'rubygems'
require 'sunspot'

module Memory
  class InstanceAdapter < Sunspot::Adapters::InstanceAdapter
    def id
      @instance.object_id
    end
  end

  class DataAccessor < Sunspot::Adapters::DataAccessor
    def load( id )
      ObjectSpace._id2ref( id.to_i )
    end

    def load_all( ids )
      ids.map { |id| ObjectSpace._id2ref( id.to_i ) }
    end
  end
end

The first search returns The Dark Tower. The second search returns both books.

Some comments about the code:

  • The latter listing is the contents of memory.rb.
  • The Book object is run-of-the-mill Ruby code, with one exception: one attribute must be designated as the object’s unique identifier. Here, because RAM is used as the persistent store, the ID for a Book object is Ruby’s internal object ID.
  • The statement Sunspot::Adapters::InstanceAdapter.register( Memory::InstanceAdapter, Book ) ties an adapter to Book. The adapter is a kind of callback: when a Book is added to the Solr index, the adapter provides the primary ID of the instance being indexed.
  • The statement Sunspot::Adapters::DataAccessor.register( Memory::DataAccessor, Book ) is the inverse of the previous adapter. Given an ID from the Solr search results, this adapter pulls the associated record from persistent store and creates a Ruby object.
  • The Sunspot.setup block specifies the type of each attribute. Because a consumer might like to search for an author using only a last name, the author attribute is marked for full text indexing. (A field of the Sunspot string type matches only on the exact, complete value.)
  • The rest adds two books to the search index and performs two searches.
  • The code assigns each book to a variable to keep the objects from being garbage-collected. Again, this is a safeguard necessary only for this demonstration. Further, because this application terminates and frees all its memory, the final statement purges all data from the index, lest the index be full of dangling pointers. If you tinker with this demo code and receive an error such as 0x91d00c is not id value (RangeError), your index is stale and you should purge it and begin again. This is achieved easily: copy the last line of the application, Sunspot.remove_all!( Book ), to the line immediately after the Sunspot.setup block.
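To see why the choice between text and string matters, here is a toy model in plain Ruby. It is only a sketch: tokenize, text_match?, and string_match? are hypothetical helpers invented for illustration, and Solr's real text analysis is considerably more involved.

```ruby
# Toy model of the two field types. This is NOT how Solr analyzes text;
# tokenize, text_match?, and string_match? are hypothetical helpers.

# Split a value into lowercase word tokens, roughly as a full-text field might.
def tokenize(value)
  value.downcase.scan(/\w+/)
end

# A text field matches when every keyword token appears among the field's tokens.
def text_match?(field_value, keywords)
  (tokenize(keywords) - tokenize(field_value)).empty?
end

# A string field matches only when the whole value is identical.
def string_match?(field_value, query)
  field_value == query
end

author = 'Stephen King'
puts text_match?(author, 'King')            # true
puts string_match?(author, 'King')          # false
puts string_match?(author, 'Stephen King')  # true
```

Under this model, a search for King finds the author only if author is a text field, which is why the demo declares it that way.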

Again, memory is a (very) atypical persistent store and used for this demo for brevity. Typically, your data is stored in a database; the ID stored in the search engine is the row ID of the record; and the adapter to pull the data is a fetch. Indeed, this is shown in the next section, which uses ActiveRecord as the persistent store.
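The round trip just described can be sketched without Sunspot at all. In the following, a Hash stands in for a database table. StoredBook, BOOKS, RowIdAdapter, and RowFetcher are hypothetical names; they only mimic the roles played by the InstanceAdapter and DataAccessor classes in memory.rb. The sketch also assumes that IDs come back from the index as strings, which is what the to_i calls in memory.rb suggest.

```ruby
# Stand-in for a database table: row ID => record.
# All names here are hypothetical and exist only for this sketch.
StoredBook = Struct.new(:id, :title)

BOOKS = {
  1 => StoredBook.new(1, 'The Dark Tower'),
  2 => StoredBook.new(2, 'Beat the Reaper')
}

# Indexing side: report the row ID that the search engine should store.
class RowIdAdapter
  def initialize(instance)
    @instance = instance
  end

  def id
    @instance.id
  end
end

# Search side: turn IDs returned by the search engine back into objects.
class RowFetcher
  # Fetch one record; returns nil if the row no longer exists.
  def load(id)
    BOOKS[id.to_i]
  end

  # Fetch many records, silently skipping IDs with no matching row.
  def load_all(ids)
    ids.map { |id| load(id) }.compact
  end
end

# Assumption: the index hands IDs back as strings, hence to_s here and to_i above.
indexed_id = RowIdAdapter.new(BOOKS[1]).id.to_s
puts RowFetcher.new.load(indexed_id).title
puts RowFetcher.new.load_all(%w[2 1 99]).map(&:title).join(', ')
```

Unlike the object_id approach, a stale ID here yields nil rather than a RangeError, because the lookup goes through the store instead of through the Ruby heap.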

Sunspot on Rails

Recently, Mat Brown released Sunspot on Rails to provide seamless integration between Sunspot and ActiveRecord. All of the machinations shown earlier (persistence, mapping, and lookups) are performed automatically. You must call Sunspot.setup( ClassName ) to define field types for ClassName, but the rest is easy.

Sunspot on Rails requires an additional gem and a few lines of configuration.

$ sudo gem install outoftime-sunspot outoftime-sunspot_rails \
  --source=http://gems.github.com

Since the names of the Sunspot and Sunspot on Rails gems differ from the name of the library each provides, add the following two lines to your gem dependencies.

config.gem 'outoftime-sunspot', :lib => 'sunspot'
config.gem 'outoftime-sunspot_rails', :lib => 'sunspot/rails'

You must also create a config/sunspot.yml file.

common: &common
  solr:
    hostname: localhost
    port: 8983

production:
  <<: *common

development:
  <<: *common
  solr:
    port: 8982

test:
  <<: *common
  solr:
    port: 8981
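One detail of this file is worth checking: the YAML merge key (<<) is shallow. When development declares its own solr key, that key replaces the whole solr mapping inherited from common rather than merging into it, so the hostname entry is not carried over. The sketch below loads a trimmed copy of the file with Ruby's standard YAML library to show the effect. Whether Sunspot on Rails falls back to a default hostname in that case is not stated here; repeating hostname under each overriding solr key avoids relying on it.

```ruby
require 'yaml'

# A trimmed copy of config/sunspot.yml from above.
config_text = <<~CONFIG
  common: &common
    solr:
      hostname: localhost
      port: 8983

  production:
    <<: *common

  development:
    <<: *common
    solr:
      port: 8982
CONFIG

# Newer Psych versions reject aliases in YAML.load unless asked,
# so prefer unsafe_load where it exists and fall back otherwise.
config = if YAML.respond_to?(:unsafe_load)
           YAML.unsafe_load(config_text)
         else
           YAML.load(config_text)
         end

# production inherits the full solr mapping from common.
puts "production hostname:  #{config['production']['solr']['hostname'].inspect}"
# development's own solr key replaces the inherited mapping wholesale.
puts "development hostname: #{config['development']['solr']['hostname'].inspect}"
puts "development port:     #{config['development']['solr']['port']}"
```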

With those amendments in place, you can start the Sunspot on Rails server with rake.

$ rake sunspot:solr:start

To make an ActiveRecord model searchable, simply use searchable.

class Book < ActiveRecord::Base
  searchable do
    text    :author
    text    :blurb
    integer :edition
    string  :isbn10, :isbn13
    float   :price
    time    :published_at
    text    :publisher
    string  :sort_title do
      title.downcase.sub(/^(an?|the)\W+/, '') if title = self.title
    end
    text    :title
  end
end
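The sort_title field is computed by a block rather than read from an attribute. To see what the block produces, here is the same expression wrapped in a hypothetical helper method and run in plain Ruby, outside of Sunspot. The last two titles are extra examples chosen to exercise the regular expression.

```ruby
# The same expression as the sort_title block, wrapped in a hypothetical helper.
# It lowercases the title and strips one leading article (a, an, or the).
def sort_title(title)
  title.downcase.sub(/^(an?|the)\W+/, '') if title
end

['The Dark Tower', 'Beat the Reaper', 'A Tale of Two Cities', 'Another Country'].each do |t|
  puts "#{t} -> #{sort_title(t)}"
end
# The Dark Tower -> dark tower
# Beat the Reaper -> beat the reaper
# A Tale of Two Cities -> tale of two cities
# Another Country -> another country
```

Only an article at the very start of the title is removed, and only when it is followed by a non-word character, so Another Country keeps its first word. A nil title yields nil rather than raising an error.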

By default, a model is indexed whenever it’s saved, and is removed from the index whenever it is destroyed. Options can alter these defaults.

Once a model is made searchable, search is an analog to find.

results = Book.search do
  keywords 'King'
end

In a scenario where you prefer to not load the data for all matching objects, Sunspot on Rails provides search_ids, which returns only the IDs of objects that match your criteria and not the entire object.

Don’t Just Find. Search.

Sunspot makes complex searches as easy as database queries. Installing and configuring Solr is an additional burden, but it’s not onerous and the community of Solr developers is very cordial.

Mat Brown has also made it simple to index all your data. The class method reindex empties the existing index (if any) and reindexes all records.

For example, to reindex all the books in the bookstore application, run…

Book.reindex

…from within your code, your Rails application console, or deployment task.

If you search for “Rails Solr” on Google, you will also find the acts_as_solr plug-in. I use acts_as_solr now, but plan to switch to Sunspot as soon as possible.
