Find my Book

2011 / Find my Book is an ongoing personal project that I started in March 2011. After seeing the power of Google Docs spreadsheet functions like ImportHTML, I realized the same things could be accomplished with PHP's cURL functions. The project began with the intention of cross-referencing college textbook websites, Craigslist, student textbook-selling websites, and Google in order to find the true "cheapest price" of a book. It is largely unfinished, and I mostly use it to experiment with PHP. All code was written by me unless otherwise stated in a function's header comment.

Things I Liked:

  • Exploring the functions of cURL and picking apart websites to gather information
  • Combining this data in a database and comparing it, which opened my mind to how powerful content aggregators can be
  • Taking advantage of functions to eliminate duplicate code (although right now there are still a lot of duplicates, kept simply for testing)

Relevant PHP Sample Files:

Features Implemented:

  • Scraping of Google and textbook websites for titles, prices, and ISBNs
  • Ability to input the number of results to retrieve per page, the number of pages, and different search terms
  • Weight/confidence values assigned to each website to track how effective its results were during previous runs
  • Mutex joins against a "Domains" table to determine whether a website has been visited before
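
As a rough illustration of how the scraping and weighting features above could fit together, here is a minimal PHP sketch. It is not the project's actual code: the function names (isValidIsbn13, extractIsbn13, cheapestTrustedListing), the listing array shape, the example domains, and the 0.5 confidence threshold are all assumptions made for this example. It pulls an ISBN-13 out of already-downloaded HTML, checks it against the standard ISBN-13 checksum, and picks the cheapest listing among sites whose weight meets the threshold.

```php
<?php
declare(strict_types=1);

// ISBN-13 checksum: digits are weighted 1,3,1,3,... and the total must be divisible by 10.
function isValidIsbn13(string $digits): bool
{
    if (!preg_match('/^\d{13}$/', $digits)) {
        return false;
    }
    $sum = 0;
    for ($i = 0; $i < 13; $i++) {
        $sum += (int) $digits[$i] * ($i % 2 === 0 ? 1 : 3);
    }
    return $sum % 10 === 0;
}

// Return the first checksum-valid ISBN-13 found in a page, or null if there is none.
function extractIsbn13(string $html): ?string
{
    // Match 978/979-prefixed numbers, allowing a hyphen or space before each digit.
    preg_match_all('/97[89](?:[\s-]?\d){10}/', $html, $matches);
    foreach ($matches[0] as $candidate) {
        $digits = (string) preg_replace('/\D/', '', $candidate);
        if (isValidIsbn13($digits)) {
            return $digits;
        }
    }
    return null;
}

// Hypothetical ranking step: ignore domains below the weight threshold,
// then return the cheapest remaining listing (or null if none qualify).
function cheapestTrustedListing(array $listings, array $domainWeights, float $minWeight = 0.5): ?array
{
    $best = null;
    foreach ($listings as $listing) {
        $weight = $domainWeights[$listing['domain']] ?? 0.0; // unseen domains get no trust
        if ($weight < $minWeight) {
            continue;
        }
        if ($best === null || $listing['price'] < $best['price']) {
            $best = $listing;
        }
    }
    return $best;
}

// Example with made-up data: the cheapest price comes from a low-weight domain and is skipped.
$listings = [
    ['domain' => 'a.example', 'price' => 12.00],
    ['domain' => 'b.example', 'price' => 35.50],
    ['domain' => 'c.example', 'price' => 41.25],
];
$weights = ['a.example' => 0.2, 'b.example' => 0.9, 'c.example' => 0.8];
$choice = cheapestTrustedListing($listings, $weights);
```

In a real run the HTML would come from a cURL request and the weights from the database; both are left out here to keep the sketch self-contained.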