Friday, May 23, 2008
I'm working on a small fantasy football project that needs the full 2008 NFL schedule. I decided to scrape the schedule from ESPN using Hpricot. The CSV file I created from the output of the script below can be found here: nfl-schedule.csv (11.48 KB)

Download script (parse_schedule.rb)

#!ruby
require 'rubygems'
require 'hpricot'
require 'open-uri'

class Game
   attr_accessor :date, :week, :away_team, :home_team, :time

   def to_s
      "#{@date} #{@time} #{@away_team} at #{@home_team}"
   end

   def to_csv
      "#{@week},#{@date.gsub(",", "")},#{@time},#{@away_team},#{@home_team}"
   end
end

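# Walks every row of the schedule table and returns an array of Game objects.
def parse_games(doc)
   games = []
   week = nil   # set by the most recent 'stathead' row
   date = nil   # set by the most recent 'colhead' row
   doc.search("//table[@class='tablehead']//tr").each do |tr|
      first_cell = tr.at("td")
      next if first_cell.nil?   # skip rows that have no cells at all

      week = tr.search("/td/a").inner_html if tr[:class] == 'stathead'
      date = first_cell.inner_html if tr[:class] == 'colhead'

      # A game row links both teams from its first cell: away team, then home team.
      teams = first_cell.search("a").map { |team| team.inner_html }

      if teams.size == 2
         game = Game.new
         game.date = date
         game.week = week
         game.time = tr.search("td:eq(1)").inner_html
         game.away_team = teams[0]
         game.home_team = teams[1]
         games << game
      end
   end
   games
end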

games = parse_games(Hpricot(open("http://sports.espn.go.com/nfl/schedule")))
games.each do |g|
   puts g.to_csv
end
puts "Total games: #{games.size}"
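
Once the output is saved to a file, it can be read back with Ruby's standard `csv` library. The following is only a sketch: `load_schedule`, `COLUMNS`, and the sample rows are made up for illustration, and the column order is assumed to match `Game#to_csv` above (week, date, time, away team, home team). It also assumes the output was redirected as-is, so the trailing "Total games" line has to be skipped.

```ruby
require 'csv'

# Column order mirrors Game#to_csv: week, date, time, away team, home team.
COLUMNS = [:week, :date, :time, :away_team, :home_team]

# Turns CSV text into an array of hashes keyed by COLUMNS.
# Rows with a different field count (such as the trailing
# "Total games: N" line the script prints) are skipped.
def load_schedule(csv_text)
   rows = CSV.parse(csv_text).select { |row| row.size == COLUMNS.size }
   rows.map do |row|
      Hash[COLUMNS.zip(row.map { |field| field.to_s.strip })]
   end
end

# Made-up sample rows for illustration only; not real schedule data.
SAMPLE = <<CSV_TEXT
Week 1,THU SEP 4,7:00 PM,Team A,Team B
Week 1,SUN SEP 7,1:00 PM,Team C,Team D
Total games: 2
CSV_TEXT

load_schedule(SAMPLE).each do |game|
   puts "#{game[:week]}: #{game[:away_team]} at #{game[:home_team]}"
end
```

Note that `to_csv` does not quote its fields, so a team name or time containing a comma would shift the columns; filtering on the field count drops such rows rather than misreading them.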
Saturday, June 28, 2008 5:22:16 PM (Eastern Daylight Time, UTC-04:00)
Hello,

I was wondering how I run your code? I can't seem to find the compiler.

Also, do you have the code for the project you were working on as well?

Sergio
Sergio
Saturday, June 28, 2008 5:42:19 PM (Eastern Daylight Time, UTC-04:00)
It's a Ruby script, so you'll need to install Ruby to run it. Once Ruby is installed, you need to install Hpricot (http://code.whytheluckystiff.net/hpricot/wiki/InstallingHpricot). With both installed and the script above downloaded, you can run it with the following from a command prompt.

> ruby parse_schedule.rb
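
If the script fails on its `require` lines, a quick check like the one below can confirm whether Hpricot is actually visible to Ruby. This is a minimal sketch; the helper name `library_available?` is made up for this example, and it assumes Hpricot was installed as a gem.

```ruby
require 'rubygems'

# Returns true if the named library can be loaded, false otherwise.
def library_available?(name)
   require name
   true
rescue LoadError
   false
end

if library_available?('hpricot')
   puts "hpricot is installed; run: ruby parse_schedule.rb"
else
   puts "hpricot is missing; install it with: gem install hpricot"
end
```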
Saturday, June 28, 2008 9:40:22 PM (Eastern Daylight Time, UTC-04:00)
Thanks for the info.

What language did you code your fantasy football project in? I have a similar task and was wondering if you'd mind sharing your code.

Sergio
Sergio
Wednesday, August 06, 2008 6:22:43 PM (Eastern Daylight Time, UTC-04:00)
Just wanted to thank you for the csv schedule! You've saved me tons of time on my own project.
Wednesday, August 06, 2008 6:26:57 PM (Eastern Daylight Time, UTC-04:00)
Jaime, glad it was useful! Enjoy.

~ Steve
Monday, August 11, 2008 2:10:04 PM (Eastern Daylight Time, UTC-04:00)
Seems a lot of people are using hpricot for doing scrapes of data from HTML these days. Thanks for doing this, it is exactly what I needed, and the hpricot stuff will help should I need some more info too.

Thanks again!