Tuesday, May 31, 2005

The Ethics of Sporting Rules

Last weekend's Indy 500 witnessed a controversy when Robby Gordon complained that Danica Patrick's small stature (she weighs only 100 pounds) gave her a big speed advantage -- an estimated 1 mph -- and demanded that the IRL modify its rules to mandate a minimum weight for each car-plus-driver combination. Patrick eventually finished fourth. Eric Rescorla over at Educated Guesswork made the following interesting comment in reaction:

"No doubt the weight advantage is real, but so what? Sports are full of situations in which one competitor has a physical advantage over another. Presumably, Gordon's current position is at least partly due to his good reflexes. Should he have to put some kind of damper on his steering wheel so that I have a shot against him? "

I (mis-)interpreted the above statements as an argument saying that Gordon had no right to demand such a rule change and argued against this position. (See here for details.) Eric's subsequent comments clarify that he is really criticizing Gordon's churlishness (a matter of "taste"), not arguing about rights. He also remarks that sporting rules are arbitrary and designed for maximizing interest rather than fairness. (A brilliant example of this is in the evolution of college basketball rules when confronted by players like Russell, Wilt and Kareem.)

While I agreed with him, I still wanted an answer to the question: Is it more "ethical" to require a minimum weight for the car+driver -- negating Danica's weight advantage -- or to only require a minimum weight for the car alone? (Note that both Formula One and horse racing use the former metric while the Indy 500 uses the latter.)

The answer is tricky because it requires a model of motor racing's goals as a sport. In the abstract, I decided that its primary goal was to compare individual driving ability. Driver weight is a variable that interferes with a fair measurement of driving ability, and should therefore be controlled for by mandating a minimum weight for car+driver. In reality, things are less clear-cut, especially in Formula One, because the sport is as much about the quality of the car as it is about driving ability. And how do we balance against Formula One's bias towards shorter drivers who fit into their cars better? The world is an unfair place.

Monday, May 30, 2005

NBA: Why the Suns live on

Completing a hat-trick of sports posts for the day, I'm happy to note that the crowd-pleasing Phoenix Suns are still in business, having staved off the San Antonio Spurs in an elimination game (ESPN). While the Suns showed signs of an improved defense that double-teamed more aggressively, and came up with a game-winning shot block, the real story was the number of free throws that San Antonio missed. Phoenix had better hope for more of the same, or pull up their socks rebounding-wise, if they want to win Game 5, because their offense has little room for improvement after shooting 57%.

BTW, one aspect of the Suns' game that has been mostly ignored (with the exception of one ESPN article that I can't find any more) is the fact that they are a very non-traditional fast-break team that relies more on open 3-pointers in transition than on lay-ups. That explains why they're still able to fast-break so much despite modern defenses that quickly drop an extra man back to guard the lane (instead of letting him hit the offensive glass like they used to in the '80s). The extra man can stop lay-ups in transition, but can't really guard all the space around the 3-point line. Other Phoenix Suns wannabes, take note. You need 4 good 3-point shooters to make it work.

Finally, Paul "the funniest man in the NBA" Shirley of the Phoenix Suns continues to post on his playoff blog. Some of his earlier playoff and regular-season posts have made me wish we could simply interview him during the press conferences instead of all those boring players and coaches.

The Baseball Pitch Clock

Have you ever felt terribly bored watching a pitcher read the catcher's signs for what seems like an eternity? Ever wished that every game were pitched by Mark Mulder and Mark Buehrle? Ever bought a TiVo just so you could fast-forward through each pitch and watch the game in 40 minutes flat?
You will love the following proposed change to baseball's rules, originally dreamed up in long-ago discussions with Ananthan: Create a 15-second pitch clock, analogous to the 24-second shot clock in basketball. A pitcher gets exactly 15 seconds to make his pitch, counting down from the time the umpire hands out the ball (or the catcher catches the previous one). Miss the deadline, and the hitter gets a walk.

Random Tennis Thoughts

As I am watching some French Open coverage on ESPN, here's what I'm thinking:
  • Both Rafael Nadal and Justine Henin-Hardenne are currently riding 21-match winning streaks. Want to know something weirder? All 42 (but of course, 42) wins have come on clay.
  • Is there a sport that has ever witnessed more choke jobs than tennis? Right from the classic Novotna and Hingis crash-and-burns to yesterday's efforts by Kuznetsova and Gaudio, tennis almost seems to be evolving to eliminate the killer instinct in its players.
  • Marat Safin might be the only person in the world who can beat Roger Federer at his best. But he is also capable of going 5 sets with Tommy Robredo a couple of days after demolishing Juan Carlos Ferrero on clay.

Saturday, May 28, 2005

Stanford and the B-School "Hackers"

There was a big hue-and-cry raised a couple of months ago when someone realized that the admissions decisions made by a number of business schools were actually available on the official web site although they weren't supposed to be public; all one had to do to get at the results was to make a slight change in the URL that led to the personalized page of the applicant. (See The Volokh Conspiracy for a full account.) When a number of people, quite naturally, decided to check out their admission status, Harvard and MIT got on their high horse and denied admission to all these people, citing a breach of "ethics". At that time, I felt proud of the Stanford B-school for showing a little more wisdom (or a little less idiocy) when they decided that they would not reject these applicants outright, but instead required them to send an explanation of their conduct.

Now, it turns out all these candidates were eventually rejected (SF Chronicle) with the dean stating that none of them offered a "compelling explanation". What would constitute a compelling explanation, I wonder? How about "curiosity"? Or did they expect someone to claim that they were held at gunpoint and forced to check out the web site?

If I were one of those rejected students, I would console myself by asking why I'd want to join an institution that lacks the wisdom to acknowledge its own egregious failure in safeguarding information that it considered private, and compounds its failure by blaming my "unethical" actions that consisted of typing in an official web site URL to gain access to information that I was reasonably entitled to anyway. Since when did those actions constitute "hacking"? I hope someone sees fit to sue, if only to publicize this story further!

Reading Philip K. Dick

I have just finished reading "The Minority Report", a collection of some of the early science-fiction short stories of Philip K. Dick, including the title story that inspired the Spielberg movie of the same name. The stories are a must-read for PKD fans, saturated by the familiar theme of muddled protagonists seeking enlightenment in a dystopian future of weak-kneed individuals distracted by trivial pursuits, all under the watchful eye of Big Brother. For newbies, however, I would recommend starting with one of the more "refined" novels of PKD such as "The Man in the High Castle", "Do Androids Dream of Electric Sheep?" or "Flow My Tears, The Policeman Said".

PKD, although relatively obscure in his living days, has gained immensely in stature in recent years as a "fictionalizing philosopher" in the Kafka mould, obsessing over the nature of reality through his SF writings. The story of his life, from his childhood trauma to vertigo to later experiments with drugs and the revelatory experiences that formed the basis of his later works, makes for fascinating reading. Two good starting points for more information about Dick and his work are Richard Corliss's long article in Time magazine and the Philip K. Dick web site.

While there have been many famous film adaptations of PKD's work, from Blade Runner to Total Recall to Minority Report, none of them really captures the spirit of his writing and philosophy. I have very high hopes for the upcoming adaptation of "A Scanner Darkly" to be directed by Richard Linklater. In fact, Linklater's wonderfully strange Waking Life is probably the most faithful representation of PKD's vision on screen, despite the fact that the story is not based on any of PKD's writings! (Although there is an extended segment that references "Flow My Tears, The Policeman Said".)

Friday, May 27, 2005

Modernizing One-day Cricket

The International Cricket Council (ICC) has recently become concerned by the apathy with which one-day cricket is greeted by spectators nowadays, and constituted a committee to come up with proposed changes to enliven the game. As is usually the case with committees full of famous ex-cricketers rather than smart thinkers, it missed the boat with a strange list of proposed changes that, as far as I could tell, failed to address the core problems with one-day cricket.

As I see it, cricket is faced with three big problems:

  • Lack of Drama: As I discussed in an earlier post, the first inning of a cricket match is simply devoid of drama and immensely boring to the average spectator.
  • Influence of the Toss: The coin toss that determines who bats first has taken on a progressively larger role in determining the outcome of the game. With the ubiquity of day-night games and the poor state of pitches, the team batting first often has a mammoth advantage that makes for uneven contests.
  • Dominance of Batting: While spectators often prefer to watch high-scoring games, the balance of power in the game has been tilted so much in favor of the batsmen -- thanks to featherbed pitches -- that the game is reduced to a batting contest (where the contestants never face each other head on) instead of being a contest between bat and ball.

And what does this Rules & Regulations committee do to address these problems? Precious little, as far as I can see. It proposes to allow substitutes, which seems like a lousy idea that (a) does nothing to attack the above problems; (b) breaks the traditional spirit of cricket which demands that the same eleven do both the batting and the fielding; and (c) exacerbates the influence of the toss, since a team would like to overload the eleven with a batsman (in the hope of winning the toss and batting first), and then substitute in a bowler during the next inning.

The other big idea was to increase the length of fielding restrictions, which tilts the game even more in favor of the batsmen. However, the notion of letting the fielding team control when it uses the restrictions is good, as it gives some succour to fielding sides when they have to combat pinch-hitting.

What I would have done would be to break up each inning into two blocks of 25 overs, creating a total of 4 quarters (half-innings) in the match. The team winning the toss decides whether it wants to bat in the first or second quarter, and the team losing the toss chooses between the third and fourth quarter. A new ball is used for each quarter. There are many advantages to such a system:

  • The influence of the toss is minimized since teams get a more even dose of all playing conditions. In particular, they both have to field during the day and bat at night.
  • The influence of the pitch is also reduced for the same reason.
  • There is greater drama as we get to easily compare the progress of the teams from the second quarter on.

Thursday, May 26, 2005

Querying and Extracting Web Data

It has always been a source of frustration to me that it is non-trivial to pose queries over structured web data. To take a simple example, consider the sequence of tasks I had to undertake to come up with the movie statistics in my previous post:
  1. Go to Roger Ebert's Great Movies page to find his movie list in HTML.
  2. Fire up Excel. Use its import/export facility to convert the HTML table to plain-text.
  3. Use emacs reg-exp to create uniform field separators.
  4. Fire up SQLite command line, create a table schema and import the text file.
  5. Discover SQLite import bug which screws up when the last field is of numeric type.
  6. Apply some Unix "cut" and "paste" commands to shuffle the columns of the text file.
  7. Go back to SQLite, modify table schema and import data.
  8. Type out SQL Query: SELECT year/10 as decade, count(*) FROM Ebert GROUP BY decade.
  9. Enjoy results.
In hindsight, I should probably have just written a Perl script to parse the HTML and load the data directly into SQLite via the Perl DBI driver. Or even faster, just stare at the web page and manually compute my statistics. The above task is actually relatively simple because I was able to import all the data I needed and query it afterwards. But I often want to do much more complex queries that can't afford to crawl all external data ahead of time.
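
As a rough sketch of that scripted alternative (in Python rather than Perl, using only the standard library), the following parses an HTML table and loads it straight into SQLite before running the decade query. The table layout (title in the first cell, year in the second), the two-column Ebert schema, and the three-row SAMPLE are assumptions for illustration, not the real page's markup.

```python
import sqlite3
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows with no <td> cells (e.g. headers)
                self.rows.append(self._row)
            self._row = None


def decade_counts(html):
    """Load (title, year) rows into an in-memory table and count per decade."""
    parser = TableRows()
    parser.feed(html)
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Ebert (title TEXT, year INTEGER)")
    db.executemany(
        "INSERT INTO Ebert VALUES (?, ?)",
        [(title, int(year)) for title, year in parser.rows],
    )
    # Integer division drops the last digit; multiplying back gives 1940, 1950, ...
    return db.execute(
        "SELECT (year / 10) * 10 AS decade, count(*) "
        "FROM Ebert GROUP BY decade ORDER BY decade"
    ).fetchall()


# Assumed stand-in for the real list: one movie per <tr>, title then year.
SAMPLE = """
<table>
<tr><td>Casablanca</td><td>1942</td></tr>
<tr><td>Citizen Kane</td><td>1941</td></tr>
<tr><td>Vertigo</td><td>1958</td></tr>
</table>
"""

if __name__ == "__main__":
    for decade, count in decade_counts(SAMPLE):
        print(decade, count)
```

This collapses steps 2 through 8 into one pass, at the cost of having to adapt the parser to whatever the actual page looks like.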

Here is an example query: Find all movies in my local database that I have given a high rating to (yes, I have a database of movies I've seen), but which have a much lower rating at IMDB. This query requires a join between data available locally (movie title and my rating) and data off the web (IMDB rating). It is fairly straightforward to write a script that executes this query, but what would be really nice is a good tool that lets us simply pose such queries and can automatically (ok, with a teeny bit of human input) go off and figure out how to execute them -- a simple, usable query engine over a data integration system that works with web sources.
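
To make the "fairly straightforward script" concrete, here is a minimal sketch of that join in Python. Everything named here is hypothetical: the my_movies table, the thresholds, and especially lookup_rating, which stands in for whatever code would fetch a rating from the web source (no real IMDB interface is assumed). The point is the shape of the join: filter locally first, then make one remote lookup per surviving candidate instead of crawling the whole external source.

```python
import sqlite3


def overrated_by_me(db, lookup_rating, min_mine=8.5, min_gap=1.5):
    """Return (title, my_rating, web_rating) for movies I rated far above the web source.

    lookup_rating is a caller-supplied function: title -> rating, or None if unknown.
    """
    results = []
    rows = db.execute(
        "SELECT title, rating FROM my_movies WHERE rating >= ? ORDER BY title",
        (min_mine,),
    )
    for title, mine in rows:
        theirs = lookup_rating(title)  # one remote lookup per local candidate
        if theirs is not None and mine - theirs >= min_gap:
            results.append((title, mine, theirs))
    return results


def demo_db():
    """Build a tiny in-memory stand-in for the local movie database (made-up rows)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE my_movies (title TEXT, rating REAL)")
    db.executemany(
        "INSERT INTO my_movies VALUES (?, ?)",
        [("Movie A", 9.0), ("Movie B", 9.5), ("Movie C", 6.0)],
    )
    return db


# Made-up web ratings; a dict lookup plays the role of the remote source.
FAKE_WEB = {"Movie A": 8.8, "Movie B": 7.0, "Movie C": 5.0}

if __name__ == "__main__":
    print(overrated_by_me(demo_db(), FAKE_WEB.get))
```

What the script cannot do, and what a real query engine would, is work out the lookup function on its own.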

On the bright side, Alon Halevy's Semex project at Washington appears to promise all that I want. (Worryingly, Alon has a tendency to avoid my Stanford talks like the plague. Maybe the mere fact of my mentioning this problem is enough to turn him off this line of research? :-))

Generation Gap in Movie Tastes

I was recently analyzing some movie data when I found evidence to back up a long-held conjecture of mine: People preferentially like movies from the era they grow up in.

Note that the statement isn't merely about what people actually watch (they usually don't watch older movies due to various cultural and economic biases) but about what people would like watching, assuming they are exposed to movies from all eras. As evidence, I dug out and analyzed two different lists of movies: (1) Roger Ebert's list of 240 "Great Movies" drawn from his web site; (2) My own list of 213 movies that I rated 8.5 or above on a scale of 10. I broke down the distribution of these movies by decade to arrive at the following table:

The numbers tell an intriguing story. (BTW, Ebert was born in 1942 and I was born in 1979.) One could attribute the discrepancies among the movies from the 1910s and '20s to the fact that Ebert is more of a film historian than I am, and tends to include "landmark" movies that, in my opinion, don't necessarily hold up well today. The discrepancy in the 2000s could be put down to the fact that these movies are too recent to be included in Ebert's "Great Movies" series.

But the remaining stats reveal a cool trend. Ebert's numbers start off low in the 1910s, keep increasing continuously until reaching a "flat" peak in the '50s, '60s and '70s (what I would call the "Ebert era"), and then start tailing off after that. My numbers, on the other hand, show a progressive increase all the way up to the '90s (and will probably continue into the 2000s, given that we are only halfway there).

One could postulate alternative explanations for this trend. For example, you could argue that I have seen far fewer old movies than I have new ones, and I am therefore naturally biased towards the newer movies. But I would present the following pieces of evidence in opposition:
  1. Most movies I've seen have been on VHS and DVD and I've had little reason to discriminate in favor of new movies;
  2. I've had access to a broad selection of titles thanks to good libraries and Netflix;
  3. I tend to vet movies by the critical reception they received before I watch them. So, if anything, I should be biased towards movies that Ebert likes.
  4. There are many old movies in Ebert's list that I did not like as much.
  5. There are many new movies on my list that Ebert did not like as much. (Usually, they tend to be post-modern "form is content" movies like Fight Club or The Usual Suspects.)
  6. It seems hard to believe that mere limitations of access can explain the fact that I like progressively more movies from each decade starting from the '30s.

Does anyone have any corroborative/conflicting evidence?

UPDATE: (1:02PM) I should add a disclaimer that there are a number of other factors that have not been controlled for in the above analysis. For example, Ebert's intentions in creating his Great Movies series might have been to popularize relatively old and obscure movies. Some '90s movies might be considered too recent to make Ebert's cut. Ebert might have more refined movie-watching tastes and the movies might be becoming more populist by the year, etc. I still think I'm on to something though.

Wednesday, May 25, 2005

TV: The Present and the Future

These two links are courtesy Bill Simmons and his intern.

Bill Simmons over at ESPN.com came up with this, umm, blow-by-blow account of yesterday's final episode of "The Contender". So, in case you missed out on the live action, this should do it for you.

Conan O'Brien, on the other hand, got on his time machine to get a sneak peek into the Future of Television.

Cricket vs. Baseball

Like many a cricket fan stuck in the US and bored to death of summer TV, I took to watching baseball more by default than out of any inherent fascination with the sport. (BTW, I've always held that cricket is the ultimate made-for-American-TV sport, with its built-in commercial breaks every 4 minutes. But that's a story for another day.) Over the years, however, I have come to appreciate -- prepare for sacrilege here -- some of baseball's advantages over cricket, despite baseball being slower-paced, limited in tactics and lacking in strategy.

The key area where baseball wins is in suspense and drama and this can be traced to two elements of the game:

1. The At-Bat: Each at-bat, while often meaningless in the grand scheme of things, nevertheless builds tension through the balls-strikes sequence. Even if a hapless Adam Melhuse is facing Pedro Martinez down two strikes in a hopelessly lost ballgame, we still find ourselves filled with an insatiable curiosity to know what happens next. (It is as if every at-bat is a musical piece that keeps building tension until it ends in the "key" of an out or a hit.)

2. The Inning: Baseball's other big advantage is its use of innings to lend weight to many at-bats. With the prospect of stranding men on base at the end of an inning, at-bats suddenly become more crucial and weighty. (And if you are an A's fan, agonizing.) Moreover, we get to monitor the progress of both teams on an inning-by-inning basis making it easy to figure out, at all times, which team is ahead in the race.

In contrast, one-day cricket suffers from the fact that the first inning is bereft of competitive suspense, since we often have no clue as to which team is doing better. While one might argue that this is exactly what makes the game strategically complex, it also serves to lower the interest of the casual spectator, especially when the other spectator attraction -- superlative displays of batting prowess -- also comes a cropper in the middle overs. Come to think of it, watching the first inning of a cricket match is not too different from watching the opening moves of a chess game; it's all very interesting if you can follow what's going on, but it may be none too exciting otherwise.

More on what cricket needs to do to fix itself in a later post.

Tuesday, May 24, 2005

How many people watched Star Wars?

Here is a staggering statistic about the popularity of Star Wars: Episode III - Revenge of the Sith. The movie made $17 million in the U.S. solely from its midnight shows on Thursday morning. Assuming the average ticket price for a new release is $8 (the overall average across all movies is $6.23 according to NATO -- the National Association of Theatre Owners), that works out to more than 2 million people.

By the end of Day 1, the movie had made $50 million, amounting to 6.3 million viewers, or close to 2.25% of the American population! And by the end of the weekend, the number is up to 19 million viewers (assuming no one was masochistic enough to see it multiple times in the meanwhile).

But here is the kicker: 19 million is still pretty puny when compared against the heyday of cinema in 1946 when -- hold your breath -- an average of 78 million people went to the movies each week. (Source: Slate)
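
The back-of-the-envelope arithmetic can be written out as a small Python sketch. The $8 ticket price is the assumption stated above; the US population figure of roughly 296 million is my own added assumption (an approximate mid-2005 estimate), and the resulting share shifts with whatever base is used.

```python
TICKET_PRICE = 8.0            # assumed average price for a new release, in dollars
US_POPULATION = 296_000_000   # assumption: approximate mid-2005 US population


def viewers(gross_dollars, ticket_price=TICKET_PRICE):
    """Estimate the number of tickets sold from a box-office gross."""
    return gross_dollars / ticket_price


midnight = viewers(17_000_000)        # midnight shows
day_one = viewers(50_000_000)         # end of Day 1
share = day_one / US_POPULATION       # fraction of the population
weekend_vs_1946 = 19_000_000 / 78_000_000  # weekend viewers vs. one 1946 week

if __name__ == "__main__":
    print(f"midnight viewers: {midnight:,.0f}")
    print(f"day-one viewers:  {day_one:,.0f}")
    print(f"population share: {share:.1%}")
    print(f"weekend vs. 1946 week: {weekend_vs_1946:.0%}")
```

Under these assumptions the Day 1 share comes out a little over 2%, and the whole opening weekend amounts to about a quarter of a single average week in 1946.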

Star Wars and The Prisoner's Dilemma

Star Wars: Episode III - Revenge of the Sith was released in theatres at 12:00am on the morning of Thursday, May 19. I had managed to wangle a free ticket to one of the midnight shows, thanks to Google (and my contacts who shall remain nameless), which had reserved a whole screen at AMC Mercado. I showed up at 11:20pm and, in an ominous sign, had a devil of a time finding a parking spot within a quarter mile of the theater. I resigned myself to living with a front row seat.

Imagine my surprise, then, when I walked in and found tons of empty seats all over the place, even a few in the back row! Given that the parking lot was full (and presumably, all other screens were packed to the brim), I was left with three theories to explain this strange set of circumstances:

1. Google employees (and friends thereof) simply aren't big Star Wars fans.
2. Google employees are neutral to what seats they watch a movie from.
3. Google employees understand the Prisoner's Dilemma.

Theory (1) was ruled out within a few minutes as the place filled up (partially with light-saber-wielding Jedi) right before the movie started. Theory (2), while possible, appears improbable since I can think of no reason why a Googler would be seat-neutral when the average human is not. Which leaves us with Theory (3).

Finding seats in a theatre is a classic example of the Prisoner's Dilemma. For simplicity, imagine that each person can come in at one of two times -- Early or Late -- and everyone's objective is to find good seats while coming Late. Now, a selfish person would try to come Early -- ahead of the others -- to get the best shot at a seat. But if everybody were selfish, they would all end up coming Early and would have to rely on luck to get good seats. The net effect is that everyone's time is wasted without actually improving anyone's utility. If instead, they had all agreed to come Late, they could at least have saved themselves their time and would still have had the same shot at landing a good seat. Cooperation is, therefore, key to resolving the Prisoner's Dilemma, as the employees at Google seem to have realized!
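
For the skeptical, the two-person version of this game can be checked with a toy payoff model in Python. The numbers are invented purely for illustration: a good seat is worth 10, showing up Early costs 3 in wasted time, and two people arriving at the same time each have a 50% shot at the good seat.

```python
SEAT_VALUE = 10  # hypothetical value of landing the good seat
WAIT_COST = 3    # hypothetical cost of the time wasted by arriving Early


def payoff(me, other):
    """Expected utility for `me`, given both arrival choices ("Early" or "Late")."""
    if me == other:
        chance = 0.5   # same arrival time: the good seat is down to luck
    elif me == "Early":
        chance = 1.0   # I beat the other person to the seat
    else:
        chance = 0.0   # the other person beat me to it
    cost = WAIT_COST if me == "Early" else 0
    return chance * SEAT_VALUE - cost


def best_response(other):
    """The choice that maximizes my payoff against a fixed choice by the other."""
    return max(("Early", "Late"), key=lambda me: payoff(me, other))


if __name__ == "__main__":
    for other in ("Early", "Late"):
        print(f"if the other comes {other}, I should come {best_response(other)}")
    print("both Early:", payoff("Early", "Early"))
    print("both Late: ", payoff("Late", "Late"))
```

With these made-up payoffs, coming Early is the best response whatever the other person does, yet both arriving Late leaves each of them better off than both arriving Early, which is exactly the structure of the dilemma.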


To blog or not to blog, that has always been the question. After years of trepidation weighing the benefits of sharing my musings with the world against the cost of all the person-months spent reading them, I have decided to go boldly where so many men and women have gone before. I shall blog and publicize my idle thoughts, with little regard for the consequences to humanity at large. After all, how could even my mightiest efforts have any significant effect on a world reeling from the monumental damage caused by the "Star Wars" movies? (More on Star Wars later.)

So, here goes. You have been warned. Read on at your own peril.