
Internet Archive Wayback Machine Takes You Back

Posted By Raymond In Category: Computer

Jul 21, 2007

I found a very interesting website that captures billions of web pages and archives them.
Time Machine for web site
The Internet Archive Wayback Machine is a service that allows people to visit archived versions of Web sites. Visitors to the Wayback Machine can type in a URL, select a date range, and then begin surfing on an archived version of the Web. Keyword searching is not currently supported. Imagine surfing circa 1999 and looking at all the Y2K hype, or revisiting an older version of your favorite Web site. The Internet Archive Wayback Machine can make all of this possible.
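The URL-plus-date lookup described above can also be written straight into an address. Here is a minimal sketch in Python, assuming the public address pattern web.archive.org/web/<timestamp>/<url>, which is not mentioned in this post; the function name wayback_url and the constant WAYBACK_PREFIX are made up for illustration.

```python
from datetime import datetime

# Assumed public address prefix of the Wayback Machine (not stated in the post).
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(site: str, when: datetime) -> str:
    """Build a Wayback Machine address for `site` near the moment `when`."""
    # The timestamp is written as YYYYMMDDhhmmss; the archive then serves
    # the capture closest to that moment.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    # Add a scheme if the caller passed a bare host name such as www.raymond.cc.
    if "://" not in site:
        site = "http://" + site
    return f"{WAYBACK_PREFIX}/{timestamp}/{site}"


if __name__ == "__main__":
    # Address for this blog as it looked around the day of this post.
    print(wayback_url("www.raymond.cc", datetime(2007, 7, 21)))
```

Pasting the printed address into a browser should land on the nearest archived copy, if one exists for that site.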

I tried searching for www.raymond.cc at the Internet Archive Wayback Machine and I found…


The empty page before this blog even began.
Before Blog Start at Raymond.CC

Then I also found the first design I used for this blog.
First WordPress Template

Brings back the memories…

Anyway, the Internet Archive Wayback Machine contains almost 2 petabytes of data and is currently growing at a rate of 20 terabytes per month. Even a single terabyte of storage is still not very common, and the Internet Archive Wayback Machine already holds petabytes of data!
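To put those two numbers side by side, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 petabyte = 1000 terabytes) and a constant growth rate, neither of which is stated by the Internet Archive; the helper names are made up for illustration.

```python
TB_PER_PB = 1000  # assumption: decimal units; binary units would use 1024


def yearly_growth_pb(growth_tb_per_month: float) -> float:
    """Growth per year, in petabytes, at a constant monthly rate."""
    return growth_tb_per_month * 12 / TB_PER_PB


def months_to_add(extra_pb: float, growth_tb_per_month: float) -> float:
    """Months needed to add `extra_pb` petabytes at a constant monthly rate."""
    return extra_pb * TB_PER_PB / growth_tb_per_month


if __name__ == "__main__":
    # Figures quoted in the post: about 2 PB stored, growing by 20 TB a month.
    print(yearly_growth_pb(20))   # petabytes added per year
    print(months_to_add(2, 20))   # months to add another 2 PB
```

At that pace the archive adds roughly a quarter of a petabyte a year, so doubling the current 2 petabytes would take about 100 months if the rate never changed.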

If you have a web site that you would like to see saved for posterity in the Internet Archive, and you’ve searched the Wayback Machine and found no results, here’s what you can do to get the Internet Archive Wayback Machine to crawl your web site.

Method 1: Visit Alexa’s “Webmasters” page.

Method 2: If you have the Alexa toolbar installed, just visit the site.

Method 3: While visiting the site, use the ‘show related links’ feature in Internet Explorer, which uses the Alexa service.

Sites are usually crawled within 24 hours, and in no more than 48. Right now there is a 6-12 month lag between the date a site is crawled and the date it appears in the Wayback Machine.

[ Visit Internet Archive Wayback Machine ]



    • asdf

      The problem with the Wayback Machine is that it relies on the robots.txt file to let people opt out of having their sites archived. This is fine for your current site, but there are a lot of sites that were closed, and then the domains were bought by another company. The new company places a robots.txt file on the domain, and you’re banned from browsing the complete history of that site. This happens a lot lately. People complain about it on their forums all the time, and a lot of sites are not available because of it.

    • Chibuzor

      Please Raymond,
      First of all, I have to thank you for all the information you have been providing for us guys out here. I am very happy for you.

    Copyright © 2005-2012 - Raymond.CC Blog