Cross pair average versus the cross pair knot masked

jandersonlee · October 8, 2024, 10:08pm

Is there some JavaScript code somewhere, or even a brief pseudo-algorithm, that describes how to compute the cross pair average versus the cross pair knot masked?

LFP6 · October 8, 2024, 10:20pm

Are you referring to the new confidence metrics? If so, here’s where you can find the implementation:

github.com/eternagame/EternaJS

src/eterna/folding/FoldUtil.ts

1ecbdc433


      
          public static bppConfidence(
              targetPairs: SecStruct,
              dotArray: DotPlot | null,
              bppBehavior: BasePairProbabilityTransform
          ): { mcc: number, f1: number } {
              if (dotArray === null || dotArray.data.length === 0) return {mcc: 0, f1: 0};
          
              const dotMap: Map<string, number> = new Map<string, number>();
              const pairedPer: Map<number, number> = new Map<number, number>();
          
              for (let jj = 0; jj < dotArray.data.length; jj += 3) {
                  const prob: number = bppBehavior === BasePairProbabilityTransform.LEAVE_ALONE
                      ? dotArray.data[jj + 2]
                      : (dotArray.data[jj + 2] * dotArray.data[jj + 2]);
          
                  if (dotArray.data[jj] < dotArray.data[jj + 1]) {
                      dotMap.set([dotArray.data[jj], dotArray.data[jj + 1]].join(','), prob);
                  } else if (dotArray.data[jj] > dotArray.data[jj + 1]) {
                      dotMap.set([dotArray.data[jj + 1], dotArray.data[jj]].join(','), prob);
                  }

This file has been truncated.
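Since the preview above cuts off, here is a minimal Python sketch of the general idea as far as it can be read from the visible lines: the dot plot is a flat list of `(i, j, probability)` triples, each pair is stored under an ordered key, and the probability is squared unless the transform is `LEAVE_ALONE`. Everything past that point is an assumption rather than a transcription of EternaJS. In particular, the "expected" confusion-matrix entries, the choice of `n * (n - 1) / 2` candidate pairs for the true-negative count, and the names `triples_to_map` and `bpp_confidence` are illustrative only. The sketch also assumes the target pairs and the dot plot use the same indexing convention.

```python
import math

def triples_to_map(data, square=False):
    """Collapse a flat [i, j, p, i, j, p, ...] list into {(lo, hi): p}."""
    bpp = {}
    for k in range(0, len(data) - 2, 3):
        i, j, p = data[k], data[k + 1], data[k + 2]
        if i == j:
            continue  # the visible snippet also ignores i == j entries
        bpp[(min(i, j), max(i, j))] = p * p if square else p
    return bpp

def bpp_confidence(target_pairs, bpp, n):
    """Expected F1 and MCC of a target structure under pairing probabilities.

    target_pairs: set of (lo, hi) tuples; bpp: {(lo, hi): probability};
    n: sequence length. Assumption: every unordered pair of positions
    counts as one candidate, so there are n * (n - 1) / 2 in total.
    """
    tp = sum(bpp.get(pair, 0.0) for pair in target_pairs)  # predicted and probable
    fp = len(target_pairs) - tp                            # predicted but improbable
    fn = sum(p for pair, p in bpp.items() if pair not in target_pairs)
    tn = n * (n - 1) / 2 - tp - fp - fn

    f1_den = 2 * tp + fp + fn
    mcc_den_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return {
        "f1": 2 * tp / f1_den if f1_den > 0 else 0.0,
        "mcc": (tp * tn - fp * fn) / math.sqrt(mcc_den_sq) if mcc_den_sq > 0 else 0.0,
    }

# Usage: a 6-nt hairpin "((..))" with pairs (0, 5) and (1, 4).
dot_plot = [0, 5, 0.9, 4, 1, 0.8, 2, 5, 0.1]
result = bpp_confidence({(0, 5), (1, 4)}, triples_to_map(dot_plot), n=6)
```

Note that F1 never uses the true-negative count, so it is insensitive to how the candidate pairs are counted; MCC is not, which is one place a real implementation may differ from this sketch.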

github.com/eternagame/EternaJS

src/eterna/folding/FoldUtil.ts

1ecbdc433


      
          public static pkMaskedBppConfidence(
              pairs: SecStruct,
              dotplot: DotPlot | null,
              bppBehavior: BasePairProbabilityTransform
          ): { mcc: number, f1: number } {
              if (dotplot === null || dotplot.data.length === 0) return {mcc: 0, f1: 0};
          
              // With this formulation, we are effectively saying any nucleotides not predicted to be
              // part of a cross pair are "correct", as we specify they should be unpaired
              // and there are no pairing probabilities at that position > 0. This is specifically tuned
              // for F1, in which the "true negative" component is discarded in its formula.
              // As put by Rhiju:
              // The motivation for this choice is that in cases with inferred pseudoknots,
              // we really don't care about what residues outside the relevant crossed-pairs are doing.
              // We just want an estimate of whether the specific crossed-pairs will show up in the actual
              // structure. There is an analogy to how we set up the 'crossed pair quality' component for OpenKnotScore.
              const crossedPairs = pairs.getCrossedPairs();
          
              const bpps = dotplot.data.slice();
              for (let i = 0; i < bpps.length; i += 3) {

This file has been truncated.
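The masked variant is also cut off in the preview, so the following Python sketch is only one reading of the comment block above, not the EternaJS code. It assumes that (a) a "crossed pair" is any pair `(i, j)` for which some other pair `(k, l)` satisfies `i < k < j < l` or the mirror case, (b) every probability involving a nucleotide outside the crossed pairs is dropped, and (c) the result is an expected F1 over the crossed pairs alone. The function names are hypothetical.

```python
def crossed_pairs(pairs):
    """Return the pairs that cross at least one other pair (i < k < j < l)."""
    ordered = sorted(pairs)
    crossed = set()
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                crossed.add((i, j))
                crossed.add((k, l))
    return crossed

def pk_masked_f1(pairs, bpp):
    """Expected F1 restricted to nucleotides that take part in crossed pairs.

    pairs: set of (lo, hi) tuples for the whole target structure.
    bpp:   {(lo, hi): probability} for the whole dot plot.
    """
    crossed = crossed_pairs(pairs)
    if not crossed:
        return 0.0  # assumption: no pseudoknot means nothing to score

    bases = {b for pair in crossed for b in pair}
    # Mask: keep only probabilities between two nucleotides that both
    # belong to crossed pairs; everything else is treated as zero.
    masked = {pair: p for pair, p in bpp.items()
              if pair[0] in bases and pair[1] in bases}

    tp = sum(masked.get(pair, 0.0) for pair in crossed)
    fp = len(crossed) - tp
    fn = sum(p for pair, p in masked.items() if pair not in crossed)
    den = 2 * tp + fp + fn
    return 2 * tp / den if den > 0 else 0.0

# Usage: two interleaved helices plus an unrelated hairpin at 10..14.
structure = {(0, 6), (1, 5), (3, 9), (4, 8), (10, 14), (11, 13)}
probs = {(0, 6): 0.9, (1, 5): 0.9, (3, 9): 0.5, (4, 8): 0.5,
         (10, 14): 0.2, (0, 10): 0.3, (1, 4): 0.2}
score = pk_masked_f1(structure, probs)
```

Because F1 ignores true negatives, treating the masked-out nucleotides as "correctly unpaired" does not inflate the score, which matches the motivation given in the quoted comment.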

jandersonlee · October 8, 2024, 10:35pm

@LFP6 Thanks. (I think.) Looks Greek to me but I’ll try and figure it out.

LFP6 · October 8, 2024, 10:48pm

Happy to answer questions if I can. It’s probably more intuitive to look at Arnie’s implementation of bppConfidence (which it calls expected accuracy; I’ve changed that terminology since “accuracy” is a specific statistical measure and I wanted to avoid confusion with it):

github.com/DasLab/arnie

src/arnie/utils.py

e2455d14c


      
          def get_expected_accuracy(dbn_string, bp_matrix, mode='mcc'):
              '''given a secondary structure as dbn string and base pair matrix, 
              assess expected accuracy for the structure.
          
              Inputs:
              dbn_string (str): Secondary structure string in dot-parens notation.
              bp_matrix (NxN array):  symmetric matrix of base pairing probabilities.
              mode: ['mcc','fscore','sen','ppv']: accuracy metric for which to compute expected value.
          
              Returns: expected accuracy value.
              '''
          
              assert bp_matrix.shape[0] == bp_matrix.shape[1]
              assert bp_matrix.shape[0] == len(dbn_string)
          
              struct_matrix = convert_dotbracket_to_matrix(dbn_string)
              N = len(dbn_string)
          
              pred_m = struct_matrix[np.triu_indices(N)]
              probs = bp_matrix[np.triu_indices(N)]

This file has been truncated.

Note that Arnie works with a symmetric matrix (unlike Eterna). Also keep in mind that Eterna currently has some frustrating inconsistencies/bugs between Vienna and other engines; see EternaJS issue #803, “Inconsistent dot plot/pairing probabilities behavior between Vienna/Vienna2 and everything else”.

jandersonlee · October 9, 2024, 5:39am

@LFP6 Thanks for the pointers. I think I have working code now. At least it seems to be giving the same values as the widgets in the lab tool. Integrating it into arcknot and getting that to work with RNNet. Can display RNNet knots now, but the mutation code is broken. Poco a poco.