Liya’s trip to CSHL

November 14th, 2008 by Liya Ren

I visited the lab last Friday (November 7th). Since it was a Friday, I didn’t meet many colleagues. The lab office was being remodeled. I’ll go to the lab again next week (possibly November 19th), and some telecommuters will be there too!
Doreen spent the whole day with me. We discussed my objectives for the next several months. First we discussed protein annotation and xrefs, and Will joined us by phone. We set up a document describing the protein annotation and xref pipeline so that results stay consistent when different groups run it. We also looked into why the GO annotation results for rice, maize and sorghum are inconsistent. One reason is that the underlying annotation is uneven: rice is better annotated than maize and sorghum, with more UniProt protein annotations. It is also possible that the annotation pipeline was run with different versions of the InterPro database and the xref source databases. Another reason is biological, e.g. the corresponding region has become a partial gene or an intron.
We also discussed some compara analysis with Josh by phone. Josh wrote a very informative Word document on the research objectives. The first thing I need to do is find the orthologue dataset. The summer interns wrote a Perl script that gets the orthologues from the compara database using the Ensembl API. I have used this script to get the orthologues, and I’d like to update it to make it more generic. I would also like to take a look at the mart schema to see the underlying structure. The next item is to look for the syntenic regions between rice and sorghum. Jack Chen’s lab has developed OrthoCluster (Ismael Vergara) to find syntenic regions. Josh also pointed out DiagHunter and SyMap.
The third item on the to-do list is microRNA targets, pathway analysis and enrichment for GO categories. I will read Chris’s paper, ‘Identifying microRNAs in plant genomes’, get the protein targets for the microRNAs from Lifang, and identify possible pathways and GO categories.
The trip was very objective-oriented. I’ll visit the lab more often from now on.

HDF5 and next-gen data

November 10th, 2008 by Jer-Ming Chia
I’ve realized that over the last few months I have become de-sensitized to large numbers. It’s either because of the time spent working on the next-generation sequencing data or the never-ending 0s being added to the bailout package.

Read the rest of this entry »

Moving to Subversion

October 31st, 2008 by Shiran Pasternak

Recently, we decided to migrate our entire codebase from CVS to Subversion (SVN). Ken did most of the groundwork. I am really excited about our new version control setup, and have been dreaming about this for a long time. But to butcher a biblical metaphor, I only played Aaron to Ken’s Moses.

Read the rest of this entry »

Ken’s scary Halloween update

October 31st, 2008 by Ken Youens-Clark

It’s Halloween and hot in Texas, and I’m blogging. Read the rest of this entry »

GO Slim

October 21st, 2008 by Liya Ren

Gramene uses GO terms to annotate proteins and genes. I am familiar with GO terms since I am responsible for developing and maintaining the ontology database and loading all kinds of ontology terms and annotations. However, I knew little about GO slim terms. This summer Doreen asked me to generate the GO and GO slim annotations for rice and Arabidopsis, and only then did I start working with GO slim terms.

GO slim terms are a subset of GO terms. A GO slim is still an ontology and a DAG. Initially I thought GO slim terms were a bunch of ancestor terms with no relationships among them, but that is not the case. Because the GO ontology is a DAG and the GO slim is also a DAG, one term used in an annotation may map to several slim terms via different paths. In the slim example graph, nodes 9 and 10 are quite interesting. Node 9 appears to map to slim nodes 3 and 4, but it does not map to node 3, because node 3 is a parent of slim node 4; node 4 is the more pertinent term for node 9. Node 10 can reach slim node 2 and slim node 3 via two paths (10->5->2 or 10->8->6->3). Since slim nodes 2 and 3 have no hierarchical relationship, node 10 maps to both.
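
The mapping rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code that map2slim actually runs. The parent links below are an assumption: a small hypothetical graph built to match the two cases in the text (node 9 and node 10), not a copy of the real example graph, and the function names are made up for this sketch.

```python
# Hypothetical DAG: child -> list of direct parents (an assumption for illustration).
PARENTS = {
    2: [1], 3: [1], 4: [3], 5: [2],
    6: [3], 7: [4], 8: [6], 9: [7], 10: [5, 8],
}
# Hypothetical slim: the subset of terms kept in the slim ontology.
SLIM = {1, 2, 3, 4}


def ancestors(term, parents):
    """Return every proper ancestor of `term`, following all paths in the DAG."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def map_to_slim(term, parents, slim):
    """Return the most pertinent slim terms for `term`.

    A slim term is dropped when it is an ancestor of another slim term
    that `term` also reaches, because the descendant is more specific.
    """
    candidates = (ancestors(term, parents) | {term}) & slim
    redundant = set()
    for candidate in candidates:
        redundant |= ancestors(candidate, parents) & candidates
    return sorted(candidates - redundant)


print(map_to_slim(9, PARENTS, SLIM))   # node 9 keeps only slim node 4
print(map_to_slim(10, PARENTS, SLIM))  # node 10 keeps slim nodes 2 and 3
```

In this sketch node 9 reaches slim nodes 4, 3 and 1, but 3 and 1 are ancestors of 4, so only 4 is kept. Node 10 reaches slim nodes 2 and 3 by separate paths, and neither is an ancestor of the other, so both are kept.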

The Gramene ontology database stores the GO terms and the relationships among them. It does not store the slim terms or their relationships, so I can’t use the Gramene ontology database to do the GO slim mapping. I found that Chris Mungall already has a script, map2slim, that does this work. It is easy to use now, since Shuly has already installed the go-perl package for AmiGO on our machine and I have the ontology CVS, which contains the GO ontology and the GO slim ontology terms. map2slim needs its input file in the GO annotation format. All I had to do was write a generic script that gets the Ensembl gene annotations from the Ensembl databases and writes them out in the GO annotation format. My work is done! Later I’d like to look into the map2slim code to see what it does when the relationships in the slim ontology are not consistent with those in the GO ontology, e.g. when a slim term has become obsolete in the GO ontology. At the moment the script just fails in that case.
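
To give an idea of what the export step produces, here is a minimal Python sketch that formats one annotation as a tab-delimited line. It is not the script I wrote. It assumes the 15-column gene association layout, and the function name, the gene ID, the reference and the `assigned_by` value are all made-up placeholders. The column order should be checked against the GO file format documentation before relying on it.

```python
def gaf_line(db, object_id, symbol, go_id, reference, evidence, aspect,
             taxon, date, assigned_by, qualifier="", with_from="",
             name="", synonym="", object_type="gene"):
    """Format one annotation as a 15-column, tab-delimited line.

    The column order is an assumption based on the GO gene association
    layout; empty strings are written for optional columns.
    """
    fields = [
        db, object_id, symbol, qualifier, go_id,
        reference, evidence, with_from, aspect, name,
        synonym, object_type, taxon, date, assigned_by,
    ]
    return "\t".join(fields)


# Placeholder values only: the gene ID and reference are hypothetical.
line = gaf_line(
    db="GR_gene", object_id="GENE0001", symbol="GENE0001",
    go_id="GO:0008150", reference="GR_REF:0000", evidence="IEA",
    aspect="P", taxon="taxon:4530", date="20081021", assigned_by="GR",
)
print(line)
```

Keeping the formatting in one small function means the same code can serve every species database: only the query that fetches the gene annotations changes.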

Optimizing MySQL server variables on our production environment

October 19th, 2008 by Shuly Avraham

This weekend I’ve been working on tuning the MySQL server parameters in order to optimize MySQL on our production environment, known as ‘flume’. Read the rest of this entry »

Will - It’s late on Friday night…

October 17th, 2008 by William Spooner

It’s late on a stormy Friday night in mid-October, probably a good time for my inaugural post to this blog, particularly as I have done very little in the way of plant informatics work recently.

So, what has been on my plate? I’ve contributed a chunk of web code to the (long delayed) Ensembl 51 release. This is a page for displaying gene trees. I’ve been promising various Ensembl-ites for many years that I would work this up, so this is my special gift for their half-century. I’m not going to try to explain what it looks like, or provide a link (it’s not released yet), sorry. I am, however, quite pleased by the end result!

But most of my time recently has been taken up with writing the ‘WormMart’ paper in the last few weeks that I have been working on that project (WormBase is off to Canada without me). WormMart, for those not intimately familiar with resources for the study of the biology of nematodes, is my (BioMart-based) data warehouse for WormBase and is, as such, similar to GrameneMart. So I’ve been buried in bioinformatics database papers in an effort to absorb enough of their nuances for me to produce a decent forgery.

Ken’s weeks ending 10/17/08

October 17th, 2008 by Ken Youens-Clark

I missed posting last Friday because I was out, Read the rest of this entry »

Ken’s week ending 10/3/08

October 3rd, 2008 by Ken Youens-Clark

Now that Gramene’s 28 release is mostly done (I’m rebuilding CMap with UniGenes and need to figure out the stacked maps still), I’ve been on a modernization and reorganization kick. Read the rest of this entry »

A new Gramene release

September 16th, 2008 by Ken Youens-Clark

Build 28 of Gramene is finally out Read the rest of this entry »