org.strbio.mol
Class AlignmentSet

java.lang.Object
  extended by org.strbio.mol.AlignmentSet

public class AlignmentSet
extends java.lang.Object

Class to represent a set of alignments (i.e. correct, calculated) between two polymers.

 Version 2.41, 9/9/99 - fixed to account for new MinareaResults class
 Version 2.4, 4/22/99 - fixed things to go with new Alignment, moved
   most stats into AlignmentStats class.
 Version 2.31, 3/31/99 - kludged avgScoreWithGaps, avgScoreWithoutGaps
 Version 2.3, 2/22/99 - fixed several things for consistency;
   changed last to calculated.
 Version 2.21, 12/2/98 - changed Alignment to AlignmentSet, and
   AlignmentVector to Alignment.
 Version 2.2, 11/4/98 - uses AlignmentVector objects instead of
   int[] to hold arrays
 Version 2.1, 11/2/98 - changed internal representation; last[]
   now stores last alignment loaded or created
 Version 2.01, 10/29/98 - added minareaUnSuperimpose
 Version 2.0, 10/19/98 - changed default load/save format to include
   minarea info, correct/predicted/naildown info
 Version 1.41, 9/30/98 - added minareaInfo, removed change of 1.3.
 Version 1.4, 9/29/98 - added saveCurrentCASP
 Version 1.3, 9/28/98 - added ratio output option to minareaAlign()
 Version 1.21, 7/22/98 - fixed bugs in minareaAlign()
 Version 1.2, 4/21/98 - works with Polymers instead of Proteins.
 Version 1.1, 4/13/98 - interface to minarea
 Version 1.01, 4/7/98 - bug fixes to stats()
 Version 1.0, 4/1/98 - original version
 

Version:
2.41, 9/1/99
Author:
JMC
See Also:
Protein, Polymer

Field Summary
 AlignmentStats alignmentStats
          alignment accuracy results; mostly cached values for things that can be calculated on the fly.
 Alignment calculated
          A vector containing the last alignment calculated with Align.
 Alignment correct
          A vector of correct sequence-fold monomer pairs.
 Polymer fold
          The fold being aligned.
 MinareaResults minareaResults
          minarea superposition results; assume fold rotated onto sequence.
 Alignment nail
          A vector of 'nailed down' seq-fold monomer pairs.
 Polymer seq
          The sequence being aligned.
 
Constructor Summary
AlignmentSet()
          making a new alignment sets everything to null.
AlignmentSet(Polymer s, Polymer f)
          make a new alignment with given seq, fold.
 
Method Summary
 double[] ASns(int tolerance)
          calculate alignment sensitivity, as in CASP2.
 double[] ASpc(int tolerance)
          calculate alignment specificity, as in CASP2.
 double averageScore(ScoreList sl)
          Returns average score of all pairs of aligned monomers.
 int[] calculatedFoldToSeq()
          Returns, for each fold monomer, the index of the aligned sequence monomer in the calculated alignment.
 double calculatedRMS()
          return calculated RMS; assume molecules already superimposed.
 int[] calculatedSeqToFold()
          Returns, for each sequence monomer, the index of the aligned fold monomer in the calculated alignment.
 int[] correctFoldToSeq()
          Returns, for each fold monomer, the index of the aligned sequence monomer in the correct alignment.
 void correctlyAlign()
          re-align both sequence and fold to conform to the correct alignment.
 double correctRMS()
          return correct RMS; assume molecules already superimposed.
 int[] correctSeqToFold()
          Returns, for each sequence monomer, the index of the aligned fold monomer in the correct alignment.
 double globalAlign(AlignmentParameters ap)
          Do the global alignment, store in 'calculated', return comparison score.
 double globalCompare(AlignmentParameters ap)
          Find the global comparison score.
 void load(java.io.BufferedReader infile)
           
 void load(java.io.BufferedReader infile, PolymerSet seqs, PolymerSet folds)
          Load alignment out of new format file.
 void load(java.lang.String in_file)
          Load alignment out of new format file.
 void load(java.lang.String in_file, PolymerSet seqs, PolymerSet folds)
          Load alignment out of new format file.
 void loadCorrect(java.lang.String filename)
          load correct alignment from file.
 void loadNail(java.lang.String filename)
          load 'nail' alignment from file.
 void makeCalculatedFromCurrent()
          Sets up 'calculated' array based on current alignment.
 void makeCurrentFromCalculated()
          Sets up gaps in both Polymers as in 'calculated' array.
 void makeSameLength()
          Pad both sequences to same length.
 void minareaAlign()
          get 'correct' alignment of sequence and fold using minarea.
 void minareaInfo(Printf outfile)
          Shows some info from minarea: ratio score, RMS, and pct_id. The minarea command is 'nw_minarea -A -g 0.1'.
 void minareaSuperimpose()
          Superimpose seq and fold.
 void minareaUnSuperimpose()
          Undo minareaSuperimpose()
 double[] pctRight(int tolerance)
          Evaluates the accuracy of the calculated alignment.
 void printCalculated(Printf outfile, boolean showgaps)
          prints out both sequences in calculated alignment with alternating seq and fold lines.
 void printCalculatedOnCorrect(Printf outfile)
          prints out both sequences with alternating seq and fold lines, secondary structure information, relative shift info.
 void printCorrect(Printf outfile, boolean showgaps)
          prints out both sequences in correct alignment with alternating seq and fold lines.
 void printModeller(Printf outfile)
          prints both sequences in a format Modeller likes.
 void save(Printf outfile)
          Saves correct, calculated, and nail alignments to a file in new format.
 void save(java.lang.String filename)
          Saves correct, calculated, and nail alignments to a file in new format.
 void saveCalculated(java.lang.String filename)
          Saves the calculated alignment to a file in old format.
 void saveCalculatedCASP(Printf outfile)
          Save calculated alignment in CASP format, to an open file.
 void saveCalculatedCASP(java.lang.String filename)
          Save calculated alignment in CASP format
 void saveCorrect(java.lang.String filename)
          Saves the correct alignment to a file in old format.
 double[] shift()
          calculate average alignment shift, as in CASP2.
 void stats(Printf outfile)
          get statistics on the alignment.
 void stripGaps()
          Strip gaps from both seq and fold.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

seq

public Polymer seq
The sequence being aligned.


fold

public Polymer fold
The fold being aligned.


correct

public Alignment correct
A vector of correct sequence-fold monomer pairs.

See Also:
loadCorrect(java.lang.String)

nail

public Alignment nail
A vector of 'nailed down' seq-fold monomer pairs.

See Also:
loadNail(java.lang.String)

calculated

public Alignment calculated
A vector containing the last alignment calculated with Align.


minareaResults

public MinareaResults minareaResults
minarea superposition results; assume fold rotated onto sequence.


alignmentStats

public AlignmentStats alignmentStats
alignment accuracy results; mostly cached values for things that can be calculated on the fly.

Constructor Detail

AlignmentSet

public AlignmentSet()
making a new alignment sets everything to null.


AlignmentSet

public AlignmentSet(Polymer s,
                    Polymer f)
make a new alignment with given seq, fold.

Method Detail

pctRight

public final double[] pctRight(int tolerance)
Evaluates the accuracy of the calculated alignment. Returns a double[3] containing the percent correct, the number of correctly aligned positions (which is really an int converted to a double), and the total number of aligned positions (likewise).
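
The shape of the returned array can be illustrated with a small stand-alone sketch. This is not the org.strbio code: the class name PctRightSketch is hypothetical, and several details are assumptions, namely that both alignments are given as seq-to-fold index arrays (0-based, -1 for a gap), that a position counts as correct when the calculated fold index is within +/- tolerance of the correct one, that the total is counted over positions aligned in the correct alignment, and that the percentage is on a 0 to 100 scale.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the org.strbio implementation. */
class PctRightSketch {

    /**
     * correct[i] and calculated[i] hold the 0-based fold index aligned to
     * sequence monomer i, or -1 for a gap (assumed representation).
     * Returns {percent correct, number correct, number aligned}.
     */
    static double[] pctRight(int[] correct, int[] calculated, int tolerance) {
        int right = 0;
        int aligned = 0;
        for (int i = 0; i < correct.length; i++) {
            if (correct[i] < 0) {
                continue; // not aligned in the correct alignment
            }
            aligned++;
            boolean paired = calculated[i] >= 0;
            if (paired && Math.abs(calculated[i] - correct[i]) <= tolerance) {
                right++;
            }
        }
        // Assumption: an empty alignment scores 0 rather than NaN.
        double pct = (aligned == 0) ? 0.0 : 100.0 * right / aligned;
        return new double[] {pct, right, aligned};
    }

    public static void main(String[] args) {
        int[] correct = {0, 1, 2, -1, 4};
        int[] calculated = {0, 2, -1, 3, 4};
        System.out.println(Arrays.toString(pctRight(correct, calculated, 0)));
        System.out.println(Arrays.toString(pctRight(correct, calculated, 1)));
    }
}
```

With these sample arrays, raising the tolerance from 0 to 1 turns the off-by-one pair at index 1 into a correct position.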


ASpc

public final double[] ASpc(int tolerance)
calculate alignment specificity, as in CASP2. Returns a double[4] with the ASpc value, the ACrct value (correctly aligned positions), Na (aligned positions in this alignment), and NaC (aligned positions in the correct alignment)


ASns

public final double[] ASns(int tolerance)
calculate alignment sensitivity, as in CASP2. Returns a double[4] with the ASns value, the ACrct value (correctly aligned positions), Na (aligned positions in this alignment), and NaC (aligned positions in the correct alignment)
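
ASpc and ASns return the same three counts and differ only in the denominator. The sketch below is a hedged illustration, not the library code. It assumes the usual CASP2 reading (ASpc = ACrct / Na, ASns = ACrct / NaC), a 0 to 100 scale, seq-to-fold index arrays with -1 for gaps, and the same +/- tolerance rule as above. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the org.strbio implementation. */
class CaspAccuracySketch {

    /** Returns {ACrct, Na, NaC} for two seq-to-fold arrays (-1 = gap). */
    static int[] counts(int[] correct, int[] calculated, int tolerance) {
        int aCrct = 0; // positions aligned correctly (within tolerance)
        int na = 0;    // positions aligned in the calculated alignment
        int naC = 0;   // positions aligned in the correct alignment
        for (int i = 0; i < correct.length; i++) {
            boolean inCalc = calculated[i] >= 0;
            boolean inCorrect = correct[i] >= 0;
            if (inCalc) na++;
            if (inCorrect) naC++;
            if (inCalc && inCorrect
                    && Math.abs(calculated[i] - correct[i]) <= tolerance) {
                aCrct++;
            }
        }
        return new int[] {aCrct, na, naC};
    }

    /** Specificity: correct positions relative to what was predicted (assumed). */
    static double[] aSpc(int[] correct, int[] calculated, int tolerance) {
        int[] c = counts(correct, calculated, tolerance);
        double value = (c[1] == 0) ? 0.0 : 100.0 * c[0] / c[1];
        return new double[] {value, c[0], c[1], c[2]};
    }

    /** Sensitivity: correct positions relative to the correct alignment (assumed). */
    static double[] aSns(int[] correct, int[] calculated, int tolerance) {
        int[] c = counts(correct, calculated, tolerance);
        double value = (c[2] == 0) ? 0.0 : 100.0 * c[0] / c[2];
        return new double[] {value, c[0], c[1], c[2]};
    }

    public static void main(String[] args) {
        int[] correct = {0, 1, 2, 3, -1};
        int[] calculated = {0, 1, -1, -1, -1};
        System.out.println(Arrays.toString(aSpc(correct, calculated, 0)));
        System.out.println(Arrays.toString(aSns(correct, calculated, 0)));
    }
}
```

In the sample data the calculated alignment is short but entirely right, so under these assumptions it scores high on specificity and lower on sensitivity.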


shift

public final double[] shift()
calculate average alignment shift, as in CASP2. Returns double[3] with average shift, total shift, and the number of positions compared.
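
A minimal sketch of how such a three-element result could be computed follows. It is an illustration under stated assumptions, not the library code: shift is taken as the absolute difference between the calculated and correct fold indices, only positions aligned in both alignments are compared, and the class name ShiftSketch is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the org.strbio implementation. */
class ShiftSketch {

    /**
     * correct[i] and calculated[i] hold the 0-based fold index aligned to
     * sequence monomer i, or -1 for a gap (assumed representation).
     * Returns {average shift, total shift, positions compared}.
     */
    static double[] shift(int[] correct, int[] calculated) {
        long total = 0;
        int compared = 0;
        for (int i = 0; i < correct.length; i++) {
            if (correct[i] >= 0 && calculated[i] >= 0) {
                // Assumption: unsigned shift, so opposite errors do not cancel.
                total += Math.abs(calculated[i] - correct[i]);
                compared++;
            }
        }
        double average = (compared == 0) ? 0.0 : (double) total / compared;
        return new double[] {average, total, compared};
    }

    public static void main(String[] args) {
        int[] correct = {0, 1, 2, 3};
        int[] calculated = {0, 3, -1, 4};
        System.out.println(Arrays.toString(shift(correct, calculated)));
    }
}
```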


correctFoldToSeq

public final int[] correctFoldToSeq()
Returns, for each fold monomer, the index of the aligned sequence monomer in the correct alignment. Indices start at 0; -1 indicates a gap.


correctSeqToFold

public final int[] correctSeqToFold()
Returns, for each sequence monomer, the index of the aligned fold monomer in the correct alignment. Indices start at 0; -1 indicates a gap.


calculatedSeqToFold

public final int[] calculatedSeqToFold()
Returns, for each sequence monomer, the index of the aligned fold monomer in the calculated alignment. Indices start at 0; -1 indicates a gap.


calculatedFoldToSeq

public final int[] calculatedFoldToSeq()
Returns, for each fold monomer, the index of the aligned sequence monomer in the calculated alignment. Indices start at 0; -1 indicates a gap.
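
The SeqToFold and FoldToSeq vectors describe the same pairing from opposite sides. The sketch below shows how one can be derived from the other, assuming a SeqToFold array is indexed by sequence monomer and holds fold indices. The class MappingSketch and its invert method are hypothetical helpers, not part of AlignmentSet.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the org.strbio implementation. */
class MappingSketch {

    /**
     * Turns a seq-to-fold vector into the matching fold-to-seq vector.
     * Both use 0-based indices, with -1 marking a gap.
     */
    static int[] invert(int[] seqToFold, int foldLength) {
        int[] foldToSeq = new int[foldLength];
        Arrays.fill(foldToSeq, -1); // every fold monomer starts unpaired
        for (int s = 0; s < seqToFold.length; s++) {
            int f = seqToFold[s];
            if (f >= 0) {
                foldToSeq[f] = s;
            }
        }
        return foldToSeq;
    }

    public static void main(String[] args) {
        int[] seqToFold = {1, -1, 3};
        System.out.println(Arrays.toString(invert(seqToFold, 4)));
    }
}
```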


correctRMS

public final double correctRMS()
return correct RMS; assume molecules already superimposed.


calculatedRMS

public final double calculatedRMS()
return calculated RMS; assume molecules already superimposed.
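
For reference, an RMS over aligned pairs of already-superimposed molecules can be sketched as below. This is not the library code: the coordinate arrays, the choice of one representative point per monomer, and the class name RmsSketch are all assumptions made for illustration.

```java
/** Hypothetical stand-alone sketch; not the org.strbio implementation. */
class RmsSketch {

    /**
     * seqXyz[i] and foldXyz[j] are {x, y, z} for one representative atom
     * per monomer (assumed). seqToFold[i] is the fold index paired with
     * sequence monomer i, or -1 for a gap. No superposition is done here.
     */
    static double rms(double[][] seqXyz, double[][] foldXyz, int[] seqToFold) {
        double sumSq = 0.0;
        int pairs = 0;
        for (int s = 0; s < seqToFold.length; s++) {
            int f = seqToFold[s];
            if (f < 0) {
                continue; // gaps contribute nothing
            }
            for (int k = 0; k < 3; k++) {
                double d = seqXyz[s][k] - foldXyz[f][k];
                sumSq += d * d;
            }
            pairs++;
        }
        return (pairs == 0) ? 0.0 : Math.sqrt(sumSq / pairs);
    }

    public static void main(String[] args) {
        double[][] seq = {{0, 0, 0}, {0, 0, 0}, {9, 9, 9}};
        double[][] fold = {{3, 4, 0}, {3, 4, 0}};
        System.out.println(rms(seq, fold, new int[] {0, 1, -1}));
    }
}
```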


load

public final void load(java.io.BufferedReader infile,
                       PolymerSet seqs,
                       PolymerSet folds)
                throws java.io.IOException
Load alignment out of new format file.

Throws:
java.io.IOException

load

public final void load(java.io.BufferedReader infile)
                throws java.io.IOException
Throws:
java.io.IOException

load

public final void load(java.lang.String in_file,
                       PolymerSet seqs,
                       PolymerSet folds)
                throws java.io.IOException
Load alignment out of new format file. If seqs/folds are given, seq and fold are found from these.

Throws:
java.io.IOException

load

public final void load(java.lang.String in_file)
                throws java.io.IOException
Load alignment out of new format file.

Throws:
java.io.IOException

loadCorrect

public final void loadCorrect(java.lang.String filename)
load correct alignment from file. The file format contains pairs of numbers indicating which residue in the sequence matches which residue in the fold. All un-paired numbers are considered to be gaps. Note that sequence numbering starts at 1 (damn those protein scientists) and bears no relation to the n field in Residue. Existing gaps (if any) are not counted in numbering; they get wiped out anyway whenever things get re-aligned. You can comment the file by beginning lines with #.
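
The file format described above can be read with a few lines of code. The sketch below is a hedged illustration rather than the actual loader: it assumes one whitespace-separated pair per line, that fold numbering is 1-based like sequence numbering, and that the result is wanted as a 0-based seq-to-fold array with -1 for gaps. The class name PairFileSketch and the sample text are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the org.strbio implementation. */
class PairFileSketch {

    /**
     * Parses text in the pair format: "seqNumber foldNumber" per line
     * (assumed layout), numbering from 1, lines starting with # ignored.
     * Sequence positions that never appear in a pair stay at -1 (gap).
     */
    static int[] parseSeqToFold(String text, int seqLength) {
        int[] seqToFold = new int[seqLength];
        Arrays.fill(seqToFold, -1);
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank line or comment
            }
            String[] tok = line.split("\\s+");
            if (tok.length < 2) {
                continue; // not a pair; skipped in this sketch
            }
            int s = Integer.parseInt(tok[0]) - 1; // convert to 0-based
            int f = Integer.parseInt(tok[1]) - 1;
            seqToFold[s] = f;
        }
        return seqToFold;
    }

    public static void main(String[] args) {
        String text = "# sample pairs\n1 2\n3 4\n";
        System.out.println(Arrays.toString(parseSeqToFold(text, 4)));
    }
}
```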


loadNail

public final void loadNail(java.lang.String filename)
load 'nail' alignment from file. Residues paired in the nail alignment cannot be moved during alignment. The file format is described in loadCorrect.

See Also:
loadCorrect(java.lang.String)

stripGaps

public final void stripGaps()
Strip gaps from both seq and fold.


makeSameLength

public final void makeSameLength()
Pad both sequences to same length.


makeCurrentFromCalculated

public final void makeCurrentFromCalculated()
Sets up gaps in both Polymers as in 'calculated' array.


makeCalculatedFromCurrent

public final void makeCalculatedFromCurrent()
Sets up 'calculated' array based on current alignment.


correctlyAlign

public final void correctlyAlign()
re-align both sequence and fold to conform to the correct alignment.


minareaInfo

public final void minareaInfo(Printf outfile)
Shows some info from minarea: ratio score, RMS, and pct_id. The minarea command is 'nw_minarea -A -g 0.1'. Both sequences need atom information; this will be looked for in the PDB if not present. Warning: this deletes files in the current directory called check, data1, and data2 (which are produced by minarea for some reason). This does not actually align the 2 sequences; you need to call correctlyAlign() afterwards for that. Also, both sequences will be renumbered starting at 1 as a side effect.


minareaAlign

public final void minareaAlign()
Get 'correct' alignment of sequence and fold using minarea. Both sequences need atom information; this will be looked for in the PDB if not present. Warning: this deletes files in the current directory called check, data1, and data2 (which are produced by minarea for some reason). Also, both sequences will be renumbered starting at 1 as a side effect.


minareaSuperimpose

public final void minareaSuperimpose()
Superimpose seq and fold.

See Also:
Polymer.minareaSuperimpose(org.strbio.mol.Polymer, org.strbio.mol.Polymer)

minareaUnSuperimpose

public final void minareaUnSuperimpose()
Undo minareaSuperimpose()

See Also:
Polymer.minareaSuperimpose(org.strbio.mol.Polymer, org.strbio.mol.Polymer)

averageScore

public final double averageScore(ScoreList sl)
Returns average score of all pairs of aligned monomers.


printCalculated

public final void printCalculated(Printf outfile,
                                  boolean showgaps)
                           throws java.io.IOException
prints out both sequences in calculated alignment with alternating seq and fold lines.

Throws:
java.io.IOException

printCalculatedOnCorrect

public final void printCalculatedOnCorrect(Printf outfile)
                                    throws java.io.IOException
prints out both sequences with alternating seq and fold lines, secondary structure information, relative shift info.

Throws:
java.io.IOException

printCorrect

public final void printCorrect(Printf outfile,
                               boolean showgaps)
                        throws java.io.IOException
prints out both sequences in correct alignment with alternating seq and fold lines.

Throws:
java.io.IOException

globalAlign

public final double globalAlign(AlignmentParameters ap)
Do the global alignment, store in 'calculated', return comparison score.


globalCompare

public final double globalCompare(AlignmentParameters ap)
Find the global comparison score.


save

public final void save(Printf outfile)
                throws java.io.IOException
Saves correct, calculated, and nail alignments to a file in new format.

Throws:
java.io.IOException

save

public final void save(java.lang.String filename)
                throws java.io.IOException
Saves correct, calculated, and nail alignments to a file in new format.

Throws:
java.io.IOException

saveCalculated

public final void saveCalculated(java.lang.String filename)
                          throws java.io.IOException
Saves the calculated alignment to a file in old format.

Throws:
java.io.IOException

saveCorrect

public final void saveCorrect(java.lang.String filename)
                       throws java.io.IOException
Saves the correct alignment to a file in old format.

Throws:
java.io.IOException

saveCalculatedCASP

public final void saveCalculatedCASP(java.lang.String filename)
                              throws java.io.IOException
Save calculated alignment in CASP format

Throws:
java.io.IOException

saveCalculatedCASP

public final void saveCalculatedCASP(Printf outfile)
                              throws java.io.IOException
Save calculated alignment in CASP format, to an open file.

Throws:
java.io.IOException

printModeller

public final void printModeller(Printf outfile)
                         throws java.io.IOException
prints both sequences in a format Modeller likes.

Throws:
java.io.IOException

stats

public final void stats(Printf outfile)
                 throws java.io.IOException
get statistics on the alignment. This is really for debugging purposes, and you shouldn't use it.

Throws:
java.io.IOException