|
void | SetDefaultHhalignPara (hhalign_para *prHhalignPara) |
| FIXME.
|
|
int | PosteriorProbabilities (mseq_t *prMSeq, hmm_light rHMMalignment, hhalign_para rHhalignPara, char *pcPosteriorfile) |
| PosteriorProbabilities() calculates posterior probabilities of aligning a single sequence onto an alignment containing this sequence.
|
|
double | PileUp (mseq_t *prMSeq, hhalign_para rHhalignPara, int iClustersize) |
| sequentially align (chain) sequences
|
|
double | HHalignWrapper (mseq_t *mseq, int *piOrderLR, double *pdSeqWeights, int iNodeCount, hmm_light *prHMMList, int iHMMCount, int iProfProfSeparator, hhalign_para rHhalignPara) |
| wrapper for hhalign. This is a frontend function to the ported hhalign code.
|
|
void | SanitiseUnknown (mseq_t *mseq) |
| get rid of unknown residues
|
|
◆ HHalignWrapper()
double HHalignWrapper |
( |
mseq_t * |
prMSeq, |
|
|
int * |
piOrderLR, |
|
|
double * |
pdSeqWeights, |
|
|
int |
iNodeCount, |
|
|
hmm_light * |
prHMMList, |
|
|
int |
iHMMCount, |
|
|
int |
iProfProfSeparator, |
|
|
hhalign_para |
rHhalignPara |
|
) |
| |
|
extern |
wrapper for hhalign. This is a frontend function to the ported hhalign code.
- Parameters
-
[in,out] | prMSeq | holds the unaligned sequences [in] and the final alignment [out] |
[in] | piOrderLR | holds the order in which sequences/profiles are to be aligned; even elements specify left nodes, odd elements specify right nodes; if the even and odd elements are the same, the node is a leaf |
[in] | pdSeqWeights | Weight per sequence. No weights used if NULL |
[in] | iNodeCount | number of nodes in tree, piOrderLR has 2*iNodeCount elements |
[in] | prHMMList | List of background HMMs (transition/emission probabilities) |
[in] | iHMMCount | Number of input background HMMs |
[in] | iProfProfSeparator | Gives the number of sequences in the first profile, if in profile/profile alignment mode (iNodeCount==3). This assumes prMSeq holds the sequences of profile 1 followed by those of profile 2. |
[in] | rHhalignPara | various parameters read from commandline |
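To make the documented piOrderLR layout concrete, here is a minimal sketch that walks such an array and counts its leaf nodes. It relies only on the description in the parameter table above (two entries per node, left at the even index, right at the odd index, equal entries marking a leaf). The helper name `CountLeaves` is hypothetical and not part of this file, and the actual implementation may store additional per-node data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the library): count leaf nodes in a
 * piOrderLR-style array.
 * Assumed layout, taken from the parameter description: 2*iNodeCount
 * entries, where entry 2*i is the left node and entry 2*i+1 the right
 * node of step i; a step whose two entries are equal is a leaf. */
static int CountLeaves(const int *piOrderLR, int iNodeCount)
{
    int iLeaves = 0;

    assert(piOrderLR != NULL && iNodeCount >= 0);
    for (int i = 0; i < iNodeCount; i++) {
        int iLeft = piOrderLR[2 * i];      /* even element: left node */
        int iRight = piOrderLR[2 * i + 1]; /* odd element: right node */
        if (iLeft == iRight) {
            iLeaves++;
        }
    }
    return iLeaves;
}
```

Under this reading, a profile/profile call with iNodeCount==3 could be described by the six entries `{0, 0, 1, 1, 0, 1}`: two leaves (the two profiles) followed by one step that merges them.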
- Returns
- score of the alignment FIXME what is this?
- Note
- Complex function. Could use some simplification, more documentation, and a struct'uring of piOrderLR.
-
HHalignWrapper can be entered in 2 different ways: (i) all sequences are un-aligned, or (ii) there are 2 (aligned) profiles. In the un-aligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain un-sanitised residues. These will cause trouble during alignment.
-
Introduced argument hhalign_para rHhalignPara; FS, r240 -> r241
-
If hhalign() fails, then try with Viterbi by setting MAC-RAM=0; FS, r241 -> r243
Translate back ambiguity residues: hhalign translates ambiguity codes (B, Z) into unknown residues (X). As we still have the original input, we can substitute them back.
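The back-substitution of ambiguity codes can be sketched as follows. This is a minimal illustration, not the library's code: the helper name `RestoreAmbiguity` is hypothetical, and it assumes upper-case residues, gaps written as `-`, and that the aligned sequence differs from the original only by inserted gaps and by B/Z having become X.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the library): restore ambiguity codes
 * in one aligned sequence.
 * pcAligned - aligned sequence, gaps assumed to be '-' (modified in place)
 * pcOrig    - the same sequence as originally read, without gaps
 * Returns the number of residues substituted back. */
static int RestoreAmbiguity(char *pcAligned, const char *pcOrig)
{
    int iRestored = 0;
    size_t j = 0; /* position in the un-gapped original */

    assert(pcAligned != NULL && pcOrig != NULL);
    for (size_t i = 0; pcAligned[i] != '\0' && pcOrig[j] != '\0'; i++) {
        if (pcAligned[i] == '-') {
            continue; /* a gap has no counterpart in the original */
        }
        /* only undo the X that stands in for an original B or Z */
        if (pcAligned[i] == 'X' && (pcOrig[j] == 'B' || pcOrig[j] == 'Z')) {
            pcAligned[i] = pcOrig[j];
            iRestored++;
        }
        j++;
    }
    return iRestored;
}
```

For example, with the aligned row `AX-CX` and the original input `ABCZ`, the row becomes `AB-CZ`. An X that was already X in the input is left untouched.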
◆ PileUp()
double PileUp |
( |
mseq_t * |
prMSeq, |
|
|
hhalign_para |
rHhalignPara, |
|
|
int |
iClustersize |
|
) |
| |
|
extern |
sequentially align (chain) sequences
- Parameters
-
[in,out] | prMSeq | holds the un-aligned sequences (in) and the final alignment (out) |
[in] | rHhalignPara | various parameters needed by hhalign() |
[in] | iClustersize | parameter that controls how often the HMM is updated |
- Note
- Chained alignment takes much longer than balanced alignment because at every step ClustalO has to scan all previously aligned residues. For a balanced tree this takes N*log(N) time, but for a chained tree it takes N^2 time. This function has a shortcut: the HMM need not be updated at every single alignment step; instead, the HMM from the previous step(s) can be recycled. The HMM is updated (i) at the very first step, (ii) if a gap has been inserted into the HMM during alignment, or (iii) if the HMM has been used for too many steps without having been updated. This update frequency is controlled by the input parameter iClustersize, which is the number of sequences an HMM must be built from to allow for one non-updating step. For example, if iClustersize=100 and an HMM has been built from 100 sequences, then this HMM can be used once without updating. If the HMM has been built from 700 sequences (and iClustersize=100), then this HMM can be used 7 times without having to be updated, etc. For this reason the initial iClustersize sequences are always aligned with fully updated HMMs.
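The update rule in the note above can be written down as a small sketch. Both helper names (`ReusableSteps`, `NeedsUpdate`) are hypothetical and not part of this file, and rounding down for sequence counts that are not a multiple of iClustersize is an assumption; the note only gives the cases 100 and 700.

```c
#include <assert.h>

/* Hypothetical helper (not part of the library): number of alignment
 * steps for which an HMM built from iSeqsInHMM sequences may be reused,
 * i.e. one non-updating step per iClustersize sequences.
 * Assumption: partial clusters do not count (integer division). */
static int ReusableSteps(int iSeqsInHMM, int iClustersize)
{
    assert(iSeqsInHMM >= 0 && iClustersize > 0);
    return iSeqsInHMM / iClustersize;
}

/* Hypothetical helper: decide whether the HMM has to be rebuilt before
 * the next step, following conditions (i) to (iii) of the note. */
static int NeedsUpdate(int bFirstStep, int bGapInserted,
                       int iStepsSinceUpdate, int iSeqsInHMM,
                       int iClustersize)
{
    return bFirstStep                     /* (i) very first step */
        || bGapInserted                   /* (ii) gap inserted into HMM */
        || iStepsSinceUpdate >= ReusableSteps(iSeqsInHMM, iClustersize);
                                          /* (iii) reuse budget spent */
}
```

With iClustersize=100, an HMM built from fewer than 100 sequences has a reuse budget of zero under this sketch, which matches the statement that the initial iClustersize sequences are always aligned with fully updated HMMs.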
◆ PosteriorProbabilities()
int PosteriorProbabilities |
( |
mseq_t * |
prMSeq, |
|
|
hmm_light |
rHMMalignment, |
|
|
hhalign_para |
rHhalignPara, |
|
|
char * |
pcPosteriorfile |
|
) |
| |
|
extern |
PosteriorProbabilities() calculates posterior probabilities of aligning a single sequence onto an alignment containing this sequence.
- Parameters
-
[in] | prMSeq | holds the aligned sequences [in] |
[in] | rHMMalignment | HMM of the alignment in prMSeq |
[in] | rHhalignPara | various parameters read from commandline, needed by hhalign() |
[in] | pcPosteriorfile | name of file into which posterior probability information is written |
- Returns
- score of the alignment FIXME what is this?
- Note
- the PP-loop can be parallelised easily FIXME
◆ SanitiseUnknown()
void SanitiseUnknown |
( |
mseq_t * |
mseq | ) |
|
get rid of unknown residues
- Note
- HHalignWrapper can be entered in 2 different ways: (i) all sequences are un-aligned, or (ii) there are 2 (aligned) profiles. In the un-aligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain un-sanitised residues. These will cause trouble during alignment. FS, r213->214
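As an illustration of the sanitising step described in the note, the following sketch rewrites one sequence in place. It is not the library's implementation: the helper name `SanitiseSeq` is hypothetical, it operates on a plain C string rather than on an mseq_t, and leaving the gap characters `-` and `.` untouched (so that aligned profiles keep their columns) is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of the library): render every
 * non-alphabetic residue of one sequence as 'X'.
 * Assumption: '-' and '.' are gap characters and must be preserved.
 * Returns the number of characters that were replaced. */
static int SanitiseSeq(char *pcSeq)
{
    int iChanged = 0;

    assert(pcSeq != NULL);
    for (size_t i = 0; pcSeq[i] != '\0'; i++) {
        unsigned char c = (unsigned char)pcSeq[i];
        if (c == '-' || c == '.') {
            continue; /* keep alignment gaps */
        }
        if (!isalpha(c)) {
            pcSeq[i] = 'X'; /* unknown residue */
            iChanged++;
        }
    }
    return iChanged;
}
```

Applied to `AC*-G1`, this sketch yields `ACX-GX` and reports two replacements.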
◆ SetDefaultHhalignPara()
void SetDefaultHhalignPara |
( |
hhalign_para * |
prHhalignPara | ) |
|
|
extern |
FIXME.
- Note
- prHhalignPara has to point to an already allocated instance