Search 5 Million Print Books in Microseconds

Damn that John Durant. Just as I was beginning to get a handle on all my tasks, he comes up with something that’s just too fun not to play with.

It’s Google’s search engine for the millions of books they’ve digitized over the years. John explains.

Google has been digitizing books for years now, and a team of researchers just published some amazing findings based on the corpus so far: over 5 million books. That’s 4% of all books. Ever. Printed. You can read about the paper here. The full paper is easy to read, and incredible (free with registration). Google also released this tool to view the frequency of any n-gram since 1720. (An n-gram is a sequence of n words, each separated by a space. 1-gram = technology, 1942, R2D2; 2-gram = yellow fever, John Wayne. And so on.) I wish they had tools like this when I was writing history papers back in college.
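For anyone who wants to see what counting n-grams looks like in practice, here is a minimal Python sketch. It is a toy illustration only, not Google's actual pipeline: the function names `ngrams` and `relative_frequency` are made up for this example, the sample sentence is invented, and splitting on whitespace is an assumed simplification of real tokenization.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive whitespace-separated words in text."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(text, phrase):
    """Share of the text's n-grams that exactly match phrase.

    n is taken from the number of words in phrase. Matching is
    case-sensitive, so "God" and "god" count as different n-grams.
    """
    n = len(phrase.split())
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    return Counter(grams)[phrase] / len(grams)


# Invented sample text, just to exercise the functions.
sample = "yellow fever was feared and yellow fever spread fast"
print(ngrams(sample, 2)[:3])
print(relative_frequency(sample, "yellow fever"))
```

Run the same count separately over the text published in each year and you would have the raw material for a frequency-over-time chart.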

Careful, ’cause it’s addicting. But you can essentially conduct your own mini studies for associations as much as you like. Here’s my stab: whole grains, saturated fat and obesity epidemic. You can click on it to see the larger version or simply click here.

Whole Grains, Saturated Fat and the Obesity Epidemic

What I find particularly interesting is how mentions of whole grains and saturated fat track in lockstep with one another beginning in about 1970, and then you have mentions of the obesity epidemic taking off in 2000, 30 years later. Could that have anything to do with that 10 pounds per decade gain (= 30 pounds and you’re obese)? I don’t know. Of course, this is only valuable for getting hints of what you might want to study further.

What would be interesting is to plug those same terms into a tool that does the same thing for all published research. Whaddya want to bet it looks about the same?

Alright, one more, for fun. How about Atkins diet vs. low fat?

Atkins diet vs low fat

Interesting how the Atkins diet came out right at the dip in low fat mentions. Not sure how to interpret that.

Alright, go have fun. If you find interesting associations, drop the link in comments here, at John’s or both.

(Can’t wait to see what Ned Kock comes up with.)

Richard Nikoley

I'm Richard Nikoley. Free The Animal began in 2003 and, as of 2021, contains 5,000 posts. I blog what I wish...from health, diet, and food to travel and lifestyle; to politics, social antagonism, expat-living location and time independent (while you sleep) income. I celebrate the audacity and hubris to live by your own exclusive authority and take your own chances.


  1. Eegah! on December 17, 2010 at 17:32

    Too much fun!

    god vs. evolution

    Something bad started happening in the late 1970s

    • Richard Nikoley on December 17, 2010 at 17:53


      Watch how they all converge in 1960 and the three former remain converged and evolution takes off.

      OK, now I’m really hooked.

    • John Hart on December 17, 2010 at 21:41

      You get a very different graph if you capitalize God, Jesus, and Allah:

    • Richard Nikoley on December 17, 2010 at 18:00

      Just religion.

      Very interesting.

    • Victoria on January 5, 2011 at 08:36

      religion and science

      Thanks to

      Saw it, had to come back and find this post on your blog…

    • Richard Nikoley on December 17, 2010 at 17:48

      Excellent on the former.

  2. Nigel Kinbrum on December 17, 2010 at 18:16

    Low carbohydrate diet vs low fat diet.

    • Richard Nikoley on December 17, 2010 at 18:23

      You gotta do a wide date range and then narrow it to take out the fluff. Here, Nigel, same thing 1980 to 2008:

      I do like that dive in low fat over the last couple of years.

    • Richard Nikoley on December 17, 2010 at 18:24

      Take out ‘diet’ from both:

  3. Jim Stone on December 17, 2010 at 18:19

    low fat vs tooth decay:

    • Richard Nikoley on December 17, 2010 at 18:26

      Now you’re thinking, Jim. Non-obvious associations.

      I don’t know about you guise, but this is a toy

  4. Steve Parker, M.D. on December 17, 2010 at 19:15

    Not an association study; just “paleolithic diet.”

    I’m skeptical about the database. Perhaps the recent books aren’t indexed yet.


    • Juan on December 19, 2010 at 07:40

      Good point, Steve. There are copyright issues involved with anything newer unless they have permission from the copyright owner. As you may know, before a book can become public domain, the copyright owner (usually the author) has to be dead for 70 years. Google Books does not always include entire books; sometimes it’s just snippets of the volume or a virtual index card, and they can thus present a book and not infringe copyright laws.

  5. Aaron Blaisdell on December 17, 2010 at 21:54
  6. Eegah! on December 17, 2010 at 23:28

    Four of humanity’s favourite ‘pastimes’, during the 20th Century.

  7. PK on December 17, 2010 at 23:41

    Looks like we got a lot more interested in chicken than beef around the low-fat craze.

  8. julianne on December 18, 2010 at 00:05

    Vitamin C, Vitamin D, Vitamin E

    Then pick an early year to look at and see what has been written about each – fascinating eg vit c,cdr:1,cd_min:1938,cd_max:1942&lr=lang_en
    Life article
    “How to hold on to youth and retard old age” …

  9. jonas on December 18, 2010 at 00:32
  10. Mike Palmer on December 18, 2010 at 01:37

    Obesity and microwaves:

  11. julianne on December 17, 2010 at 16:13

    This is fun – thanks for pointing it out.

    Just did hyperactivity and ADHD, then Vitamin D and cod liver oil

    • Richard Nikoley on December 17, 2010 at 16:16

      Put up links if they’re interesting, ie, not a mash of incomprehensible graphs.

  12. eco on December 19, 2010 at 02:54

    Will be even better if you put actual obesity statistics superimposed over the mentions of “saturated fat” and “whole grains”… 😉

  13. Brian T on December 22, 2010 at 12:50

    skin cancer, flu
