
Douglas F Shearer

Posts Tagged with hpricot

There are 2 matching posts.

Ruby Gems Installation And Compilation On OpenSolaris

I’ve had a bit of trouble installing various RubyGems on my system, mostly Hpricot and Ferret, which both require some code compilation. In an effort to save others the same trouble, I’ve compiled a list of tips here to make life easier…

  • Replace your rbconfig.rb file in the RubyGems folder with the one created by Joyent’s Benr. This will help RubyGems find make, cc, gcc, etc., which are not in their usual places on Solaris/OpenSolaris.
  • Install cc by getting Sun Studio. You have to be registered, but if you downloaded OpenSolaris from Sun, you’ll already have an account.
  • Install gcc3. If you are using BlastWave to manage your packages, simply run pkg-get install gcc3 to get it.
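
Before retrying a failed gem install, it can help to check which build tools Ruby can actually see. The following is only a minimal sketch and is not taken from Benr's rbconfig.rb: the `find_tool` helper and the list of tool names are assumptions made for illustration. It walks `PATH` looking for each executable, then prints the compiler recorded in rbconfig.rb.

```ruby
require 'rbconfig'

# Hypothetical helper: returns the full path of the first executable called
# `name` found in the given PATH string, or nil if there is none.
def find_tool(name, path = ENV['PATH'].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# The tool list is an assumption based on the tips above.
%w[make cc gcc].each do |tool|
  puts "#{tool}: #{find_tool(tool) || 'not found on PATH'}"
end

# The compiler recorded in rbconfig.rb, which mkmf uses for native extensions.
puts "rbconfig CC: #{RbConfig::CONFIG['CC']}"
```

If a tool shows as not found, or the rbconfig compiler points somewhere that does not exist, that is a likely reason for a native gem failing to build.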

Hopefully this will help some of you out.

Other Great OpenSolaris/Solaris Tips

Fellow Edinburgh Rubyist Graeme Mathieson has put together some great posts on his experiences with Solaris on his new Sun Thumper. Very lucky man!

 
 

Site Search Using Google In Ruby On Rails

When people ask about searching in a Rails app, I normally suggest Ferret. The only downside is that it can only search model data, not static content that may be marked up manually.

Enter Google. They index all of your content, no matter how it is generated. So why not use them for the search?

Prerequisites

Make sure you have the Hpricot gem installed –

gem update hpricot --source \
http://code.whytheluckystiff.net

Make sure you require both Hpricot and open-uri in your environment.rb as follows…


require 'hpricot'
require 'open-uri'

Controller

Generating the query string and getting the results is simple enough to do in a controller method, like so…

# Search site using google
def google
  @query = params[:id]
  @start = params[:start] if params[:start]
  @start ||= "0"

  # Site url as well as any other conditions you'd like.
  # I chose to ignore all of my tag cloud pages, pagination pages and date pages.
  site = 'douglasfshearer.com -"tagged with" -"posts by date" -page'

  # Pass @start as the result offset so the Prev/Next links page correctly.
  uri = "http://www.google.com/search?q=#{URI.escape('site:'+site+' '+@query+'&start='+@start.to_s)}"
  html_result = open(uri)
  parsed = Hpricot(html_result)

  # parse out the number of results.
  @no_results = parsed.to_s[/<\/b> of about (\d*)<\/b> from/,1]

  @results = (parsed/"div.g").map do |ele|
    {:title => ele.at("a").inner_text,
     :link => ele.at("a")['href'],
     # Huge fat hack alert. Use gsub to get rid of the weird stuff around the bold statements.
     :description => (ele/("font".."font/br")).to_s.gsub(/\221/,'').gsub(/\222/,'')}
  end
end
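
On newer Rubies `URI.escape` is no longer available (it was removed in Ruby 3.0), so the query string has to be built another way. Here is a minimal sketch, assuming the same kind of `site:` search as the controller above. The `google_search_url` helper and its parameter names are hypothetical. The escaping is done by `URI.encode_www_form` from the standard library.

```ruby
require 'uri'

# Hypothetical helper: builds a site-restricted Google search URL.
# site_conditions is the domain plus any exclusions, query is the user's
# search text, and start is the offset of the first result to show.
def google_search_url(site_conditions, query, start = 0)
  params = {
    'q'     => "site:#{site_conditions} #{query}",
    'start' => start.to_i
  }
  "http://www.google.com/search?#{URI.encode_www_form(params)}"
end

puts google_search_url('douglasfshearer.com -page', 'hpricot', 10)
# prints http://www.google.com/search?q=site%3Adouglasfshearer.com+-page+hpricot&start=10
```

Passing the parameters as a hash keeps `q` and `start` as separate fields, so an ampersand typed into the search box cannot leak into the rest of the query string.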

View

A very simple view for this can be written like so…

<%- if @results -%>
<h3> <%= @start.to_i+1 -%> - <%= @start.to_i+10 -%> of about <%= @no_results -%> results.</h3>

<%- @results.each do |r| -%>
	<h4><%= link_to r[:title], r[:link] -%></h4>
	<p><%= r[:description] -%></p>
<%- end -%>

<%= link_to 'Prev', :start => @start.to_i - 10 if @start.to_i >= 10 -%> | 
<%= link_to 'Next', :start => @start.to_i + 10 if @start.to_i < @no_results.to_i - 10 -%>
<%- end -%>
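
The Prev/Next conditions in the view are easier to check when pulled out into plain Ruby. This is a minimal sketch of that arithmetic only: the `pagination_offsets` helper is hypothetical, and it assumes ten results per page, as the view does.

```ruby
# Hypothetical helper mirroring the Prev/Next conditions in the view.
# Returns the :start offsets to link to, or nil where no link should appear.
def pagination_offsets(start, total, per_page = 10)
  start = start.to_i
  total = total.to_i
  {
    :prev => (start >= per_page ? start - per_page : nil),
    :next => (start < total - per_page ? start + per_page : nil)
  }
end

first_page = pagination_offsets("0", "35")   # no Prev link, Next starts at 10
last_page  = pagination_offsets("30", "35")  # Prev starts at 20, no Next link
```

Both arguments accept strings because `params[:start]` and the parsed result count arrive as strings in the controller.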

There you go. It’s just a proof of concept, is a bit dirty in places, and uses the URI :id for the query string, but it works and has a few niceties such as pagination. Go play, and report back on how you get on.

Thanks

Core code, hpricot and inspiration by _why.

Nicholas Wright for regular expressions and other random banter.

A Word Of Warning

Don’t hit Google with identical queries too often, or you may find your IP is blocked for a short period of time. Mine was, so this is probably best not used for a large production app. I would think this probably breaches Google’s T&Cs too.
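
One way to cut down on identical queries is to remember recent results for a while. The following is a minimal sketch under stated assumptions, not something from the original code: the `QueryCache` class, its method names and the ten-minute default are all hypothetical. It lives in one process's memory, so it only helps when repeat requests reach the same process.

```ruby
# Hypothetical in-memory cache; the names and the 600-second default are
# assumptions made for illustration.
class QueryCache
  def initialize(ttl_seconds = 600)
    @ttl = ttl_seconds
    @store = {}
  end

  # Returns the cached value for key if it is still fresh. Otherwise runs the
  # block, stores its result along with the current time, and returns it.
  def fetch(key, now = Time.now)
    entry = @store[key]
    return entry[:value] if entry && (now - entry[:at]) < @ttl
    value = yield
    @store[key] = { :value => value, :at => now }
    value
  end
end

cache = QueryCache.new
# The block stands in for the real Google fetch and parse.
results = cache.fetch("hpricot|0") { [] }
```

Using the query and the start offset together as the key means each page of results is cached separately.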
