Resubmitting failed SGE array tasks

Posted: December 1st, 2009 | Author: Shiran Pasternak | Filed under: Programming | No Comments »

I can't seem to find a straightforward mechanism for resubmitting specific tasks in a Sun Grid Engine (SGE) array job, so I rolled my own. There's not much to it, but it's ideal for small array jobs where the failed tasks can be specified by hand.
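As a rough illustration of the idea (not the script from the full post), resubmission can be sketched as one `qsub -t <id>` call per failed task. The function name `resubmit_tasks` and the `DRY_RUN` switch are hypothetical, and the sketch assumes the job script reads `$SGE_TASK_ID`, so that a single task can be rerun in isolation.

```shell
#!/bin/sh
# Hypothetical helper: resubmit hand-picked tasks of an SGE array job.
# Assumes the job script keys its work off $SGE_TASK_ID.

# Usage: resubmit_tasks JOB_SCRIPT TASK_ID [TASK_ID ...]
# With DRY_RUN=1 the qsub commands are printed instead of executed.
resubmit_tasks() {
    job_script=$1
    shift
    for task_id in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "qsub -t $task_id $job_script"
        else
            # -t with a single number submits a one-task array job
            qsub -t "$task_id" "$job_script"
        fi
    done
}
```

For example, setting `DRY_RUN=1` and calling `resubmit_tasks my_job.sh 3 17 42` would print three `qsub` commands without submitting anything. Any other options used in the original submission (job name, queue, resource requests) would need to be passed again; they are left out here to keep the sketch short.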



Forecast: Cloudy

Posted: November 30th, 2009 | Author: Shiran Pasternak | Filed under: Big Data | 4 Comments »

David Dooling, of the Genome Center at Washington University and the blog PolITiGenomics, has written a thoughtful post about the nexus of cloud computing and bioinformatics. Dooling does a fine job summarizing the cloud computing and Big Data conversation that’s happening in bioinformatics circles, so I won’t repeat it here.

The post essentially argues that bioinformaticians are, to mix metaphors, too quick to jump on the cloud computing bandwagon. It often makes more sense (and cents) to analyze genomic data locally (or privately). His cost-benefit analysis is spot on, but it applies only to the current state of the art. The state of bioinformatics is, however, changing very rapidly, and the scale (pun very much intended) is likely to tip.



Moving to Subversion

Posted: October 31st, 2008 | Author: Shiran Pasternak | Filed under: Uncategorized | No Comments »

Recently, we decided to migrate our entire codebase from CVS to Subversion (SVN). Ken did most of the groundwork. I am really excited about our new version control setup, and have been dreaming about this for a long time. But to butcher a biblical metaphor, I only played Aaron to Ken’s Moses.



The Mantis Issue Workflow

Posted: April 22nd, 2008 | Author: Shiran Pasternak | Filed under: Uncategorized | 16 Comments »

After discussion with several lab members, I’d like to propose a workflow for Mantis issues that aims to ensure that tasks are assigned, completed, and verified, and that effective notifications are subsequently sent to stakeholders.



Standardizing the GeneBuilder, Part I

Posted: April 9th, 2008 | Author: Shiran Pasternak | Filed under: Uncategorized | 3 Comments »

For the past few days I’ve been familiarizing myself with Chengzhi’s GeneBuilder while migrating it into the now-mainstream annotation pipeline. Chengzhi has done a great job annotating various genomes with evidence-based genes. But with looming deadlines and the need to streamline the process, I’ve been tasked with moving the GeneBuilder into a shared, automated pipeline that any user can readily run with less upfront effort. This post is a debrief of the issues I’ve encountered so far and of the modifications I’m making, which I hope are improvements. I also hope it will solicit some comments and discussion.



The Curse of Compound Statements

Posted: December 20th, 2007 | Author: Shiran Pasternak | Filed under: Programming | 2 Comments »

Pop quiz, hotshot: What does the following Perl code do?

    #!/usr/bin/env perl

    my %H        = ('a' => 0, 'b' => 0);
    my $skipped  = 0;
    my $accessed = 0;
    my $total    = 0;
    for my $key (keys %H) {
        $total++;
        my $value = $H{$key} || ($skipped++ && next);
        $accessed++;
    }
    print "Accessed: $accessed\n";
    print "Skipped:  $skipped\n";
    print "Total:    $total\n";
