Baseball hacks database

April 7, 2009

For the past few days I’ve been building a baseball database of everyone who played baseball from 1871 to 2008. The tricky part in building such a database is gathering statistics for the current season and merging them with the Lahman baseball database. A book called Baseball Hacks shows how to gather current-season statistics from http://mlb.mlb.com and insert them into a MySQL database.

One of the difficulties in merging this data is cross-referencing a player’s playerID in the Lahman database with his mlb.com ID. A playerID is built from the first five letters of a player’s last name and the first two letters of his first name, followed by a number that keeps the ID unique when several players share the same letters. The playerID for Chipper Jones, for example, is jonesch06; his mlb.com ID is 116706. Since I know how playerIDs are generated in the Lahman database, I thought about using that pattern to link the Lahman data to the mlb.com data, but the number at the end can’t be worked out from a name alone, so this method could end up being too inaccurate.
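To make the pattern concrete, here is a rough Python sketch of it. The function names are made up for illustration, and lowercasing and stripping non-letters is my assumption rather than a documented Lahman rule. It also shows why guessing is risky: the name only gives you the letters, not the number.

```python
import re


def lahman_id_prefix(first_name: str, last_name: str) -> str:
    """Letter part of a Lahman-style playerID (hypothetical helper).

    First five letters of the last name plus first two letters of the
    first name. Dropping punctuation and spaces is an assumption.
    """
    last = re.sub(r"[^a-z]", "", last_name.lower())
    first = re.sub(r"[^a-z]", "", first_name.lower())
    return last[:5] + first[:2]


def candidate_ids(first_name: str, last_name: str, max_suffix: int) -> list:
    """All IDs the name could map to, for suffixes 01..max_suffix."""
    prefix = lahman_id_prefix(first_name, last_name)
    return [f"{prefix}{n:02d}" for n in range(1, max_suffix + 1)]


print(lahman_id_prefix("Chipper", "Jones"))   # jonesch
print(candidate_ids("Chipper", "Jones", 6))   # six candidates, ending in jonesch06
```

The name narrows things down to a handful of candidates, but picking the right one still needs something else to match on.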

Luckily, I stumbled upon the forums at http://www.baseball-fever.com. In the Statistics, Analysis, & Sabermetrics area, several people were asking how to link these IDs together. The author of The Book: Playing the Percentages in Baseball posted a file that contains the playerIDs and mlb.com IDs of all players. I should be able to use this information to merge the Lahman database with the current season! Hopefully, I’ll have a working database of past seasons and the current season soon.
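With a file like that, the link becomes a simple lookup. Below is a minimal sketch in Python, assuming the file is a two-column CSV; the column names `playerID` and `mlb_id`, the layout, and the helper names are all my assumptions, since I haven't described the actual file format here. The only real row is the Chipper Jones pair from above.

```python
import csv
import io

# Assumed crosswalk layout: one row per player, two columns.
CROSSWALK_CSV = """playerID,mlb_id
jonesch06,116706
"""


def load_crosswalk(text):
    """Return a dict mapping mlb.com ID (int) to Lahman playerID (str)."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["mlb_id"]): row["playerID"] for row in reader}


def attach_player_ids(season_rows, crosswalk):
    """Split current-season rows into matched and unmatched lists."""
    matched, unmatched = [], []
    for row in season_rows:
        player_id = crosswalk.get(row["mlb_id"])
        if player_id is None:
            unmatched.append(row)  # no Lahman ID known for this player
        else:
            matched.append({**row, "playerID": player_id})
    return matched, unmatched


crosswalk = load_crosswalk(CROSSWALK_CSV)
# 999999 is a made-up placeholder ID used to show the unmatched case.
season_rows = [{"mlb_id": 116706}, {"mlb_id": 999999}]
matched, unmatched = attach_player_ids(season_rows, crosswalk)
print(matched)
print(unmatched)
```

Keeping the unmatched rows in their own list makes it easy to see which players (rookies, most likely) still need an ID assigned by hand.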

{ comments }

Ron 04.26.09 at 10:41 pm

Great idea, here. Thanks for suggesting the book, I hadn’t heard of it and I’ll have to check it now.

Jim McCurdy 01.21.11 at 9:34 pm

I know this is an old post, but why do you want to tie the Lahman database to MLB.com? Is there data on MLB.com that you can’t find in the Lahman database?

Thanks,
Jim McCurdy

Rick 03.30.11 at 7:55 pm

Hello,

We are looking for a way to download MLB box scores and build a database from them so that we can analyze the data. Can you help with this? Is there a feed service, or someone who can set this up for us?

THX,

Rick

rea 05.28.14 at 2:03 pm

Greetings, I believe your website might be having browser compatibility
problems. When I take a look at your blog in Safari, it looks
fine however when opening in IE, it’s got some overlapping issues.
I just wanted to provide you with a quick heads up!
Other than that, excellent blog!
