Optimal Secondary RNA Alignment and Nussinov’s Algorithm

Example of Nussinov’s algorithm in action. (Image Source)

Terminology

RNA Terms

  • RNA: ribonucleic acid comprised of the 5-carbon sugar, ribose, which is attached to an organic base
  • Uracil: a non-methylated nucleotide base in place of thymine which provides flexibility to RNA
  • Riboswitches: a regulatory segment of mRNA molecule that binds a small molecule
  • microRNAs: novel non-protein layers of gene regulation
  • piRNA: largest class of small non-coding RNA molecules in animals — silencing of transposons, epigenetic modifications, and post-transcriptional gene silencing
  • lncRNAs: long transcripts produced operating functionally as RNAs — not translated to proteins

RNA Structure

Visual representation of RNA structures. (Image Source)
  • Primary: sequence in which the bases (U, A, C, G) are aligned
  • Secondary: 2-D analysis of hydrogen bonds between different parts of RNA — double-strand
  • Tertiary: complete 3-D structure of RNA — “how string bends where it twists”

Secondary Structure

  • A continuous backbone is defined when A(i, i+1) = 1 for all i from 1 to n.
  • For each i from 1 to n, there is at most one A(i, j) = 1 where j <i-1. This means a base only forms a pair with one other at the time.
  • We ignore pseudo knots by saying if A(i,j)=A(k,l)=1 and i < k < j, then i < l < j.

Nussinov’s Algorithm

Goal

Example of an arbitrary model. (Image Source)
  • The scoring scheme of the model follows an additive property.
  • Psuedoknots were not allowed, which means the RNA can be split into two smaller ones that are independent of one another.

Algorithm — Initialization

Example initialization matrix.

Algorithm — DP

  • With a subsequence from i to j, there is either no edge connecting the ith base (unpaired) or there is some edge connecting the ith based to the kth base where i < k ≤ j.
  • If the edge is unpaired, the energy of subsequence E(i,j) reduces to the energy of a subsequence E(i+1, j).
  • If the edge is paired, then the energy of subsequence E(i,j) becomes E(i, k-1) + E(k+1, j) + the energy contribution of the i,k pairing.
Intuition behind Nussinov’s algorithm. (Image Source)

Applications

TL;DR

  • Nussinov’s algorithm is a method to calculate the secondary sequence of RNA from a primary sequence using dynamic programming to calculate the minimum free energy.
  • Nussinov’s algorithm only works in specific conditions, such as without pseudoknots and with small numbers of base pairs.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store