Saturday, September 29, 2012

The Second Year of CSI without Dead Bodies

In my one year anniversary post, I wrote that I hoped to write about a different list of top 10 posts next year.  There is a different list on the built-in Blogger stat counter, but some of the posts, like my 100th Post, appear to be inflated by bot traffic from places like Russia.  The 'Popular Posts' list on the right is from the built-in stat counter for the last 30 days.  The top post from the last 30 days does not register on other stat counters that I use, such as Google Analytics.  For this reason I will talk about the top posts from just the last year as reported by Google Analytics.  The all time popular posts are basically the same as those from the last year; just the order has changed.  Google Analytics also gives other information, such as how long viewers stay on the page, how many pages they visit while they're here, where they come from, and which pages refer them.

Over the last year I've had 4,028 visitors (3,027 of them unique) with an average of 2.77 pages per visit (32% were single page visits).  Visitors stayed an average of one minute and 12 seconds and were from 85 different countries (71% from the US followed by 4% from the United Kingdom and 3% from Italy and Canada).  79% of visitors were new and 21% were returning. 

The top 10 posts from the last year are as follows.

10.  A Statistical Profile of the Uninsured in Washington, DC, New Mexico, and Texas 

This post has received a steady amount of traffic since it debuted last August.  These are states with some of the most serious uninsured and poverty problems.

9.  A Kinder, Gentler Looney Tunes 

This post has also received a steady amount of traffic, I believe from Looney Tunes fans.  It discusses how Warner Brothers has modernized the old characters and speculates as to why.

8. Pitt & PSU going private: Shifting the Tax Burden to College Families & A Bigger Story Than the Pitt Bomb Threats & Joe Paterno 

This post received a lot of traffic while the bomb threats at Pitt were going on, the Jerry Sandusky trial was being adjudicated, and the state government was debating the budget for both schools.  The outcome wasn't as bad as I had feared but the possibility exists of future budget cuts by Gov. Corbett.

7.  Ruth Institute - Making Marriage Cool (In the US but not Scandinavia) 

This post from last year, on a right wing think tank's claims about marriage in the US and Scandinavian countries, still gets a steady amount of traffic (including a sizable amount from Europe).

The Iceberg that Sank that Cursed Ship
6.  Titanic Perspective

This post received a lot of traffic initially on the 100th anniversary of the sinking this year, and then it tapered off as attention turned elsewhere.

5.  Top 10 Worst Super Bowls of All Time

This page has received a steady amount of traffic on the less than super moments of Super Bowl history.

4.  Lance Armstrong's Doping Claim: A Probabilistic Calculation

Another post from last year, which was updated with new info and video clips.  It has received traffic in waves whenever there has been news about his troubles with the US Anti-Doping Agency.  I applied probability theory to his claim of never failing 500 drug tests.

3. The Civil War in a Larger International Historical Context

This post saw a resurgence of interest in the spring and summer months as more sesquicentennial (150th) anniversaries of Civil War battles came and went.  I also followed this post up with a post on how Mexico celebrated the 150th anniversary of Cinco de Mayo, when they repelled an invasion from Napoleon III of France while we were fighting the Civil War.

2. Global Warming, Wikileaks, and Statistics: What Barry Sanders Can Teach Us

This post made the biggest move on the all time list this year, with interest in Global Warming, Wikileaks and Barry Sanders remaining high.  Keyword search statistics suggest that those on the web who are searching for Barry Sanders are primarily visiting this page and staying longer.  It was meant to be a teaching example of how his running statistics can explain global warming to SportsCenter junkies.  This suggests that it may be having some of the desired impact.

1. Income and Life Expectancy. What Does it Tell Us About US?

My all time most read post is still the most popular in the second year thanks to the link it received on the webpage for the BBC documentary The Joy of Stats.  This post accounted for 14% of all the pageviews this year.  The number of views spikes every time the documentary airs anywhere in the world.  For this page, about half of the views were in the United States, 10% came from Great Britain, and 7.5% were from Italy.  Some of the longer times on the page came from Germany, Turkey and Japan.

I am proud of all of these posts and of my other 115 posts which did not make the list.  It is hard to predict how audiences will respond to a post, so the more feedback that can be had the better.  I hope you will all feel free to provide feedback and to help me bridge the digital and technical divide between us all.

Sunday, September 23, 2012

The Need for Exactness

Whether truthfully or not, stating the need for exactness is an effective tool for planting doubt in the public's mind about your opponent's claims.  Some birthers and JFK conspiracy theorists will never be satisfied with the official explanation of their respective claims.

At the bi-monthly Goo Goo Gathering that the Pittsburgh Coffee Party held, there was a lot of discussion of the PA Supreme Court's decision to have the lower courts reconsider whether the State's voter ID law is feasible.  The Republicans, who passed the law, publicly stated that it is intended to prevent voter fraud, but insiders like PA House Majority Leader Mike Turzai privately stated something else:

The Colbert Report
Blood in the Water - Mike Turzai's Voter ID Remarks

Opponents of the law demanded to see one case of voter fraud and stated that many of the state's poor and minorities would be disenfranchised because of inability to obtain an ID. 

Another example of the need for exactness is Mitt Romney's claim that 47% of the US is dependent on the US Government for assistance.  Many of his opponents, once they found out about his comments, were quick to point out who the 47% were: mostly the elderly and the working poor.

I was also asked another question at the Goo Goo Gathering, about electronic voting machines that have no paper trail: how can they be checked for accuracy?  The questioner said he had a sample of 20 machines to test for malicious software out of about 5,000 in Allegheny County.  He wanted to know if that sample size would be able to detect anything.  I computed the margin of error for the percent of machines found to be defective in the sample.  With a sample of 20, that would mean a margin of error of +/- 22.4% with 95% confidence.  This means that if 50% of machines were found to be defective in the sample (10 out of 20), the actual population proportion would be between 27.6% and 72.4%.  This is a wide margin, but we could be confident that the population proportion was different from zero.  He said that these tests have been done before but they have always come out to be 100% not defective.
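For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation.  The function names are my own, chosen for illustration.  The +/- 22.4% figure matches the conservative rule of thumb 1/sqrt(n); the textbook normal-approximation formula with z = 1.96 gives a slightly smaller 21.9%.  The last two functions are an addition to the discussion above, assuming machines are sampled at random: they ask how likely 20 machines are to catch a problem at all, and what can be said when every sampled machine tests clean.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion p."""
    return z * math.sqrt(p * (1 - p) / n)

def rule_of_thumb_margin(n):
    """Conservative shortcut 1/sqrt(n), i.e. p = 0.5 and z rounded up to 2."""
    return 1 / math.sqrt(n)

def detection_probability(defect_rate, n):
    """Chance that a random sample of n machines includes at least one
    defective machine. Treats draws as independent, which is a close
    approximation when n is small next to the 5,000 machines."""
    return 1 - (1 - defect_rate) ** n

def upper_bound_when_none_found(n, confidence=0.95):
    """Largest defect rate still consistent with finding 0 defective
    machines out of n, at the given one-sided confidence level."""
    return 1 - (1 - confidence) ** (1 / n)

n = 20
print(f"Textbook margin of error:  +/- {margin_of_error(0.5, n):.1%}")
print(f"Rule of thumb 1/sqrt(n):   +/- {rule_of_thumb_margin(n):.1%}")
print(f"Chance of catching a 5% defect rate: {detection_probability(0.05, n):.1%}")
print(f"Upper bound if 0 of {n} are defective: {upper_bound_when_none_found(n):.1%}")
```

Under these assumptions, a clean result on 20 machines still leaves room for a defect rate of roughly 14%, so "100% not defective" in the sample is weaker evidence than it sounds.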

Another method of checking the machines, I told him, would be to compare the machine vote totals to the exit poll data or even hold a mock election.  The 2006 CNN exit poll accurately predicted that Bob Casey would defeat Rick Santorum with 59% of the vote (margin of error +/- 2%).  Granted, one or two defective counting machines would not have an overwhelming effect on the statewide totals with millions voting.  If precinct level exit poll percentages could be compared to the corresponding vote totals, that could provide a better indication of the reliability of the machines.  If that is not feasible, then holding mock elections with no secret ballot on the 20 machines could be the next best thing.
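The precinct level comparison could be sketched as follows.  This is only an illustration under simple assumptions: the precinct names, poll sizes, and vote shares below are made up, the function names are hypothetical, and the exit poll is treated as a simple random sample, which real exit polls are not.

```python
import math

def exit_poll_interval(share, n, z=1.96):
    """95% interval around an exit poll share from n respondents."""
    moe = z * math.sqrt(share * (1 - share) / n)
    return share - moe, share + moe

def flag_precincts(precincts):
    """Return names of precincts whose machine-reported share falls
    outside the exit poll interval for that precinct."""
    flagged = []
    for name, poll_share, poll_n, machine_share in precincts:
        low, high = exit_poll_interval(poll_share, poll_n)
        if not (low <= machine_share <= high):
            flagged.append(name)
    return flagged

# Made-up numbers for illustration only:
# (precinct, exit poll share, exit poll respondents, machine share)
example = [
    ("Precinct A", 0.59, 200, 0.60),
    ("Precinct B", 0.59, 200, 0.45),
]
print(flag_precincts(example))
```

A flag here is a reason to look closer, not proof of a defective machine.  With many precincts, about 1 in 20 would fall outside a 95% interval by chance alone.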

I have a similar problem looking at my blog traffic statistics, which I will discuss in depth in my second year anniversary post.  My 100th post is listed as the top post for the month on the built-in stat counter for Blogger but hardly registers on Google Analytics and Stat Counter.  I have a similar post on the PUSH blog with inflated statistics.  That is why independent verification is important.

