Last month, we looked at adding search to a site using the open source ht://Dig search tools. As you’ll recall, ht://Dig handles the crawling, indexing, and search duties. However, not everyone has the access or resources required to install ht://Dig, so this month we’ll try an alternative approach — using Google from PHP.
It’s the API, Stupid
Google is built on one of the world’s largest deployments of Linux servers (somewhere around 10,000 machines). Not only is Google an amazingly powerful web search engine, it’s also available as a web service. That means you can easily write code to query Google and manipulate the results programmatically, without having to resort to “screen scraping” tricks. (For more about web services, see the August 2002 issue, available online at http://www.linux-mag.com/2002-08/web_services_01.html.)
Since its public release, the Google web service has been put to a variety of strange and amusing uses. GoogleFight (http://www.googlefight.com) is one of the most entertaining. Here, our goal is to produce a Google clone that searches only your web site. To do that, we’ll need to develop the necessary PHP code to handle a simple form (a search box), query Google, and produce a list of matching documents.
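Before touching the web service itself, it may help to see the two pieces that surround it. The following is only a minimal sketch, not the code developed in this column: it assumes the search is limited to one site by appending Google's `site:` operator to the user's terms, and it assumes each result arrives as an array with `url` and `title` keys. The function names `build_site_query()` and `render_results()`, the host `www.example.com`, and that result layout are all hypothetical; the real web service's response format may differ.

```php
<?php
// Hypothetical helper: restrict the user's search terms to a single
// site by appending Google's "site:" operator. Returns an empty
// string when the user submitted nothing to search for.
function build_site_query(string $terms, string $site): string
{
    $terms = trim($terms);
    if ($terms === '') {
        return '';
    }
    return $terms . ' site:' . $site;
}

// Hypothetical helper: turn a list of results into an HTML ordered list.
// ASSUMPTION: each result is an array with 'url' and 'title' keys.
// Values are escaped so that result text cannot inject markup.
function render_results(array $results): string
{
    if (count($results) === 0) {
        return "<p>No matching documents.</p>\n";
    }
    $html = "<ol>\n";
    foreach ($results as $result) {
        $url   = htmlspecialchars($result['url'], ENT_QUOTES);
        $title = htmlspecialchars($result['title'], ENT_QUOTES);
        $html .= "  <li><a href=\"$url\">$title</a></li>\n";
    }
    return $html . "</ol>\n";
}
```

In a form handler, the value of the search box would be passed to `build_site_query()` together with your own host name, the resulting string sent to Google, and whatever comes back handed to `render_results()`.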
To get started, there’s a bit of software and information that you’ll need to collect. First, pay a visit to Google’s API page (http://www.google.com/apis), download the developer’s kit, and create a Google account. The developer’s kit contains an API reference as well…