Why are our bioinformatics workflows so complicated!
Last week to answer one question I had to resort to information from several sources . A lot of them contributed immense value to my “workflow” and were also either difficult to perform or very easy. For a start I have ranked them in terms of both Value ( 1 for no value to 10 for a lot of value) to ease of use ( 1 for very complicated to 10 for very easy)
# Assembling my sequences in DNAstar (Value 10 : Ease 7 )
# Compiling my sequences and pulling them into Jalview. Ran CLUSTALW web service on edited alignments and realized that all of my clones had basically two sequences for their CDRs . . Jalviews excellent web-service CLUSTALW interface allowed me to quickly edit the 32 sequences , align them interactively and realize they belonged to two types. This got me thinking that maybe the primers I used to clone my CDRs from my mouse kappa light chains were probably mis-priming ( Value 10 : Ease 9)
# Use pubmed to look at precedents i.e analyze all possible papers which had sequenced the mouse anitbody kappa light chain CDR region as I had attempted to do and derive the sequences of the primers they had used. It took forever to get the right keywords to query and I still have only three kappa light chain primer sequences. ANd they are all different! ( Value 10 : Ease 1 ),
# Use my primer sequences , compare them with the literature and figure out how I had misprimed and why my sequences were all either of two types ( Still in progress Value immense : Ease 1 i.e still difficult to do)
# Use pubmed / NCBI genome to understand the sequence space for mouse kappa light chains ( Value 10 , Ease 4 , )
This is still work in progress . But to summarize –
The pubmed steps were the most painful . Pubmed search has to improve!.
Jalview contributed the most value. For a free App its a must have in any bioinformatics toolkit!. DNAstar played its role ..but for its cost ( a few thousand dollars )! It sure gave a lot less value than Jalview
All of this begs the question! ..why are bioinformatics workflows so difficult! We are a long ways away from making these things easy to do for everyone!