Stephen Baker

The Numerati

Keeping count of people (and things)

June 15, 2010

I learned while researching The Numerati that the Chinese have 11 different spellings for Osama Bin Laden. (Maybe it's up to 12 or 13 by now.) So if the quants at the National Security Agency are attempting to monitor Chinese Web traffic about the Al Qaeda leader, their computers have to recognize all of these different spellings and group them.
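To make the grouping problem concrete, here is a minimal sketch in Python. It is not how the NSA or anyone else actually does it. The `spelling_key` function is a hypothetical, deliberately crude rule (keep letters only, drop vowels, collapse repeated letters), and the sample spellings are illustrative Latin-alphabet variants, not the Chinese ones mentioned above.

```python
import re
from collections import defaultdict

VOWELS = set("aeiou")


def spelling_key(name: str) -> str:
    """Hypothetical, crude key: letters only, vowels dropped, repeats collapsed."""
    letters = re.sub(r"[^a-z]", "", name.lower())
    collapsed = []
    for ch in letters:
        if ch in VOWELS:
            continue  # vowels vary the most between transliterations
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)  # "ss" and "s" are treated as the same
    return "".join(collapsed)


def group_spellings(names):
    """Bucket names that share the same crude key."""
    groups = defaultdict(list)
    for name in names:
        groups[spelling_key(name)].append(name)
    return dict(groups)


# Illustrative inputs only (assumed for this sketch).
mentions = ["Osama Bin Laden", "Usama bin Ladin", "Ossama Ben Laden", "Stephen Baker"]
groups = group_spellings(mentions)
for key, variants in groups.items():
    print(key, variants)
```

A rule this blunt will also merge names that belong to different people, which is exactly the opposite problem described next.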

At the same time, I share a name with a prominent author who wrote best-selling books such as How to Live with a Neurotic Dog. Smart systems have to figure out that we're not the same person. (This, of course, is a huge issue for thousands of people whose names condemn them to no-fly lists.)

It sounds easy, but one of the toughest challenges in digging through unstructured data is to come up with accurate counts of people and entities. Jeff Jonas has a very thoughtful blog post and article on this. He writes:

it is essential to understand the difference between three transactions carried out by three people versus one person who carried out all three transactions.  Without the ability to determine when entities are the same, it quickly becomes clear that sensemaking is all but impossible....I find most organizations have underestimated this principle: If a system cannot count, it cannot predict.
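The distinction Jonas draws can be shown with a toy example. This is a sketch under stated assumptions, not his method: the records are made up, and `entity_key` is a hypothetical rule that treats two records as the same person only when the tidied-up name and the birth year both match. Real systems weigh far more evidence than that.

```python
from collections import defaultdict

# Made-up records for illustration: (name as written, birth year, amount).
transactions = [
    ("Pat Lee", 1970, 120.00),
    ("pat  lee", 1970, 75.50),   # same person, sloppier spelling
    ("Pat Lee", 1985, 40.00),    # a namesake, not the same person
]


def entity_key(name, birth_year):
    """Hypothetical identity rule: normalized name plus birth year."""
    normalized = " ".join(name.lower().split())
    return (normalized, birth_year)


def count_entities(records):
    """Return how many transactions each resolved entity carried out."""
    per_entity = defaultdict(int)
    for name, birth_year, _amount in records:
        per_entity[entity_key(name, birth_year)] += 1
    return dict(per_entity)


counts = count_entities(transactions)
print(len(transactions), "transactions by", len(counts), "people")
```

Counting by name alone would report one person here, and counting by raw spelling would report two for the wrong reason. Either way the count is off, and anything predicted from it inherits the error.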














