Rboss RubyGem for Yahoo! Search BOSS
August 23rd, 2008
With Search BOSS (Build your Own Search Service), Yahoo has lifted many of the restrictions of its previous search service, for example by removing the cap on the number of searches and allowing results to be re-purposed. I’ve been doing some work on using the service in Ruby, and wrote a little RubyGem called Rboss that wraps the BOSS web service. It makes using BOSS from Ruby nice and easy.
require 'rubygems'
require 'boss'

api = Boss::Api.new('boss-api-key-got-from-yahoo')

# Find news articles that are not older than 7 days
results = api.search_news('monkeys', :age => '7d')
results.each do |news|
  puts news.title
  puts news.abstract
  puts news.date
  puts news.url
end
Install the gem from GitHub:
- Add GitHub to your gem sources:
  gem sources -a http://gems.github.com
- Install the gem:
  sudo gem install eshopworks-rboss
- If you don’t already have a BOSS API key, sign up for one: http://developer.yahoo.com/wsregap
Check out the Rboss documentation and example usage at: http://github.com/eshopworks/rboss-gem
Thanks to eShopworks for sponsoring this project.
Latent Semantic Analysis in Python
December 19th, 2007
Latent Semantic Analysis (LSA) is a mathematical method that tries to bring out latent relationships within a collection of documents. Rather than looking at each document in isolation, it looks at all the documents as a whole, and the terms within them, to identify relationships.
An example of LSA:
Using a search engine, search for “sand”.
Documents are returned which do not contain the search term “sand” but do contain terms like “beach”.
LSA has identified a latent relationship: “sand” is semantically close to “beach”.
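To make the “sand”/“beach” example concrete, here is a minimal sketch of the idea. It is not the implementation this post refers to: the toy documents, the function names `raw_scores` and `lsa_scores`, and the choice of two latent dimensions are all assumptions made for illustration, and it uses NumPy's SVD on plain term counts with no weighting.

```python
import numpy as np

# Toy corpus (an assumption for illustration). Document 2 never mentions "sand".
docs = [
    "sand dune desert",
    "sand beach sea",
    "beach sea surf",
    "stock market shares",
    "market shares trading",
]

# Term-document count matrix: one row per term, one column per document.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {term: i for i, term in enumerate(vocab)}
counts = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        counts[index[word], j] += 1


def query_vector(query):
    """Count vector for the query over the corpus vocabulary."""
    q = np.zeros(len(vocab))
    for word in query.split():
        if word in index:
            q[index[word]] += 1
    return q


def cosine(a, b):
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm else 0.0


def raw_scores(query):
    """Cosine similarity on raw term counts (no latent structure)."""
    q = query_vector(query)
    return [cosine(q, counts[:, j]) for j in range(len(docs))]


def lsa_scores(query, k=2):
    """Cosine similarity after projecting onto the top-k singular directions."""
    u, _, _ = np.linalg.svd(counts, full_matrices=False)
    u_k = u[:, :k]                       # top-k term directions
    q_k = u_k.T @ query_vector(query)    # query in the latent space
    docs_k = u_k.T @ counts              # documents in the latent space
    return [cosine(q_k, docs_k[:, j]) for j in range(len(docs))]


raw = raw_scores("sand")
lsa = lsa_scores("sand")
for j, doc in enumerate(docs):
    print(f"{doc!r}: raw={raw[j]:.2f} lsa={lsa[j]:.2f}")
```

On raw term overlap, “beach sea surf” scores zero for the query “sand”. In the reduced space it scores highly, because “sand” and “beach” co-occur in another document, while the finance documents stay unrelated.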
There are some very good papers which describe LSA in detail:
- An introduction to LSA: http://lsa.colorado.edu/papers/dp1.LSAintro.pdf
- Creating your own LSA space: http://www.andrew.cmu.edu/user/jquesada/pdf/bookSpacesRev1.pdf
- Latent Semantic Analysis: http://en.wikipedia.org/wiki/Latent_semantic_indexing
This is an implementation of LSA in Python (2.4+). Thanks to scipy it’s rather simple!
Building a Vector Space Search Engine in Python
November 27th, 2007
A vector space search involves converting documents into vectors. Each dimension within the vectors represents a term. If a document contains that term then the value within the vector is greater than zero.
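As a rough illustration of that idea (and not the implementation this post refers to), the sketch below turns each document into a vector of term counts and ranks documents by cosine similarity to the query. The function names, the whitespace tokenisation, and the sample documents are assumptions made for the example; it uses only the standard library and current Python syntax.

```python
import math
from collections import Counter


def build_vocabulary(documents):
    """Every distinct term in the collection becomes one vector dimension."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def to_vector(text, vocabulary):
    """Term-count vector: greater than zero wherever the text contains the term."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, documents):
    """Return (score, document) pairs, best match first."""
    vocabulary = build_vocabulary(documents)
    query_vec = to_vector(query, vocabulary)
    scored = [(cosine(query_vec, to_vector(doc, vocabulary)), doc)
              for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


documents = [
    "the cat sat on the mat",
    "dogs chase cats",
    "the stock market fell",
]
results = search("cat mat", documents)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

Note that this naive version treats “cat” and “cats” as different terms. A fuller implementation would typically add stemming, stop-word removal, and term weighting.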
Here is an implementation of vector space searching using Python (2.4+).