All Gramene, all the time

Posted: December 19th, 2008 | Author: Ken Youens-Clark | Filed under: Uncategorized

As the Gramene project manager, pretty much everything I do is directly related to that. I got a chance in November to visit the lab to learn about our new Ensembl Mart building from Richard Holland, one of Will’s cohorts and the hired gun who whipped our Marts into shape. It’s a complicated, eye-crossing, tedious process to flatten our normalized relational databases through the many Mart tools into a structure specifically designed for data integration and mining. In the end, however, we get a system that can synthesize custom data sets not available anywhere else on our site. What’s been interesting for me personally is that, having taken on the Mart mantle, I’ve been trying to answer the user queries for Mart myself. When I get stuck, Will helps me along; when the Marts won’t support the query, I rebuild them. I was really proud of myself when I was able to answer how to get all the rice QTLs for abiotic stress mapped to chromosome 1. Moving forward, I’d like to add more data to our genes Mart (like ontology terms) and probably create Marts for our pathways database and the new association data from our diversity databases.

Even though Gramene is releasing only twice a year, we’re allowing interim updates of the site and important upgrades when they come along. Recently we recommissioned our old web server, filetta, as a public-access (and read-only) MySQL server with all of our Ensembl dbs, the markers db, and a few other dbs. I was able to use that to help a grad student at UT answer some questions about our genes db; it was so easy to just tell him to connect directly to a running copy that he could query. (And, of course, I pointed him to the “mysqldump” so he could set it up locally for himself.)

Another release I made just this week was to reinstate the OMAP stacked maps in CMap. This was a challenge for me because Ben and Bonnie always did this in the past, so I had to follow their docs, which turned out to be pretty outdated since so many things have changed since they both left. I hope that it will be easier this next release.

Since users often ask about our schemas, I finally finished up a “Build” action that I’d started long ago. If, from the Gramene root directory, you “./Build schemas” and “./Build schema_diagrams,” the Build.PL script will do a “mysqldump” of all our defined “” (in “gramene.conf”) into a “schemas” directory and then use “sqlt-graph” (from my SQL::Translator project) to create a schema diagram. It’s quite useful documentation to see how all the tables in the schemas are related.
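For anyone who wants to try the same dump-then-diagram idea outside of our Build.PL, here is a rough sketch in Python (not how the Perl build script is written). It only assembles the two command lines per database and does not run them. The database name, the output file naming, and the choice to dump structure only with “--no-data” are my assumptions for illustration; check the “sqlt-graph” options against the version of SQL::Translator you have installed.

```python
import shlex


def schema_commands(databases, out_dir="schemas"):
    """Build (but do not run) the dump and diagram commands for each database.

    Assumptions: structure-only dumps, one .sql and one .png per database,
    and database names supplied by the caller (hypothetical here).
    """
    commands = []
    for db in databases:
        sql_file = f"{out_dir}/{db}.sql"
        png_file = f"{out_dir}/{db}.png"
        # Dump only the table definitions, not the data.
        commands.append(
            ["mysqldump", "--no-data", f"--result-file={sql_file}", db]
        )
        # Turn the dumped schema into a diagram with SQL::Translator's sqlt-graph.
        commands.append(
            ["sqlt-graph", "--from=MySQL", f"--output={png_file}", sql_file]
        )
    return commands


if __name__ == "__main__":
    # "markers" is a placeholder name, not necessarily a real database name.
    for cmd in schema_commands(["markers"]):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to eyeball them before handing them to a shell or to “subprocess.run”.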

One other nice improvement we’ve made recently to Gramene is the use of an actual blog for our news items rather than updating our index page all the time. I installed WordPress on bivouac and set up a cron job that runs at the top of every hour to grab the latest three posts from the RSS feed, format them into HTML (taking care to respect Unicode characters after the problems with Pankaj’s post in Hindi), and put that into a file that gets included via SSI on the home page. It works very nicely, and I’m happy to see a much smoother system for publishing and archiving news items.
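The feed-to-HTML step is simple enough to sketch. This is a minimal Python illustration of the idea, not the script actually running on bivouac. It assumes a standard RSS 2.0 feed that lists the newest items first, and the function names and the list markup are made up for the example. The Unicode part comes down to two things: hand the parser the raw bytes so it honors the feed’s declared encoding, and write the fragment out explicitly as UTF-8.

```python
import html
import xml.etree.ElementTree as ET


def latest_posts_html(rss_xml: bytes, limit: int = 3) -> str:
    """Render the first `limit` RSS items as a small HTML list."""
    # Parsing bytes lets the XML parser use the encoding declared in the feed.
    root = ET.fromstring(rss_xml)
    lines = ["<ul>"]
    for item in root.findall("./channel/item")[:limit]:
        # Escape markup characters but leave non-ASCII text untouched.
        title = html.escape(item.findtext("title", default=""))
        link = html.escape(item.findtext("link", default=""), quote=True)
        lines.append(f'<li><a href="{link}">{title}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)


def write_fragment(path: str, fragment: str) -> None:
    """Write the fragment as UTF-8 so the SSI include serves it intact."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(fragment + "\n")
```

A cron entry would then fetch the feed, call “latest_posts_html” on the response body, and pass the result to “write_fragment” with the path of the included file.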

As we approach the data freeze in Gramene, I’m more determined than in previous builds to make it mean something. It seems data keeps flying in well after the freeze, but we’re going to have to make it stick this time if we’re to have enough time to build CMap and the Marts. We’re only 13% through 156 items in our “data” build, so I’m wondering how much gets jettisoned or done by then. I’m slowly getting more comfortable with my status as a “decider” and enforcer, so I’ll just have to bring down the hammer when the time comes. Watch out!


