Input File Formats
A bpseq file describes the secondary structure of an RNA molecule. It has three columns. The first column lists the nucleotide number of each base, in the 5' to 3' direction. The second column lists the nucleotide type of each base (A, U, G, or C), and the third column lists the nucleotide number of the base it is paired with (0 if the base is unpaired).
Below is an example RNA secondary structure (left) and its corresponding bpseq file (right). For RAG Sampler and RAG Builder, the file needs to have the extension .bpseq.
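The three-column layout described above can be read with a few lines of Python. This is only a minimal sketch, not the code used by the server: the function names and the example hairpin are hypothetical, skipping lines that start with '#' is an assumption about possible header lines, and the dot-bracket conversion assumes a structure without pseudoknots.

```python
def parse_bpseq(text):
    """Parse bpseq text into a list of (index, base, partner) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' header lines carry no structure data.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        records.append((int(fields[0]), fields[1].upper(), int(fields[2])))
    return records


def check_pairing(records):
    """Raise ValueError if a pair i -> j is not mirrored by j -> i."""
    partner_of = {index: partner for index, _, partner in records}
    for index, partner in partner_of.items():
        if partner != 0 and partner_of.get(partner) != index:
            raise ValueError(f"base {index} pairs with {partner}, but not vice versa")


def to_dot_bracket(records):
    """Render the pairing as dot-bracket notation (assumes no pseudoknots)."""
    symbols = []
    for index, _, partner in records:
        if partner == 0:
            symbols.append(".")   # unpaired base
        elif partner > index:
            symbols.append("(")   # opens a pair closed further 3'
        else:
            symbols.append(")")   # closes a pair opened further 5'
    return "".join(symbols)


# Hypothetical 9-nucleotide hairpin: a 3-base-pair stem closing a 3-base loop.
EXAMPLE = """\
1 G 9
2 G 8
3 G 7
4 A 0
5 A 0
6 A 0
7 C 3
8 C 2
9 C 1
"""

records = parse_bpseq(EXAMPLE)
check_pairing(records)
sequence = "".join(base for _, base, _ in records)
structure = to_dot_bracket(records)
```

For the example above, the first three bases pair with the last three, so the three middle bases form the hairpin loop.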
Adjacency Matrices
The first step in design is to upload the adjacency matrix of the target RNA tree graph. The adjacency matrix describes the connectivity of the vertices in the tree graph. In other words, an adjacency matrix explicitly states which vertices are connected to one another. Below is an example of an RNA tree graph (left) and its corresponding adjacency matrix (middle). The first row in the table represents the first vertex and the second row represents the second vertex and so on.
For RAG Designer, the vertices should be numbered in increasing order in the 5' to 3' direction, and the first vertex should be the 5'/3' end vertex, which has exactly one connection. This numbering specifies the order of loops in the designed RNA sequence. An example designed sequence and its 2D structure that corresponds to the tree graph topology is also shown (right). Note that more than one loop order, each with multiple 2D structures, can correspond to the same tree graph topology.
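Before uploading, it can help to check that a matrix really describes a tree graph and that the first vertex has a single connection. The sketch below is illustrative only. It assumes the matrix is stored as whitespace-separated 0/1 entries, one row per line, which may differ from the exact file layout the server expects, and the function names and example matrix are hypothetical.

```python
def parse_adjacency(text):
    """Read a whitespace-separated matrix, one row per non-empty line."""
    return [[int(x) for x in line.split()]
            for line in text.splitlines() if line.strip()]


def is_tree(matrix):
    """Return True if the matrix is a symmetric 0/1 adjacency matrix of a tree."""
    n = len(matrix)
    if n == 0 or any(len(row) != n for row in matrix):
        return False
    for i in range(n):
        if matrix[i][i] != 0:                      # no self-loops
            return False
        for j in range(n):
            if matrix[i][j] not in (0, 1) or matrix[i][j] != matrix[j][i]:
                return False
    # A tree on n vertices has exactly n - 1 edges ...
    if sum(sum(row) for row in matrix) // 2 != n - 1:
        return False
    # ... and is connected (depth-first search from the first vertex).
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in range(n):
            if matrix[v][w] and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def first_vertex_is_end(matrix):
    """Return True if the first vertex has exactly one connection."""
    return sum(matrix[0]) == 1


# Hypothetical 4-vertex tree: vertex 1 connects to vertex 2,
# and vertex 2 also connects to vertices 3 and 4.
EXAMPLE = """\
0 1 0 0
1 0 1 1
0 1 0 0
0 1 0 0
"""

matrix = parse_adjacency(EXAMPLE)
valid = is_tree(matrix) and first_vertex_is_end(matrix)
```

Row i of the matrix lists the connections of vertex i, so the row sum of the first row gives the number of connections of the first vertex.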
Output File Formats
RAG Sampler score file
The score file is provided as a TXT file. The file contains a single column listing the score of each generated candidate 3D graph.
RAG Builder score files
The score files are provided as TXT files. The 'Lowest Scores' file contains the atomic model numbers of the selected best models, along with their nucleotide number and score. The 'Model Scores' file contains, for all generated models, the model number, the graph RMSD of the model's 3D graph from the target graph, the score, steric-clash information, and the nucleotide number.
RAG Designer sequence and score files
The sequence and score files are provided as TXT files. The 'Correct Sequences' file contains the sequences that fold onto the target topology, along with their nucleotide number and the score and model number of the corresponding atomic model. The '200 Lowest Scores' file contains the same information for the top 200 (or fewer) unique sequences, together with their graph topology as predicted by RNAfold and NUPACK (which may or may not be the target topology). The 'Model Scores' file contains the model number, score, steric-clash information, and nucleotide number for all generated models/sequences.
Each 2D structure file is generated in BPSEQ format (see above). All 2D structure files are compressed and provided as a .zip file for ease of download.
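Once downloaded, the archive can be read directly with Python's standard zipfile module, without unpacking it to disk first. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the member names in the demonstration archive are invented, and the real archive may use different names or a nested folder layout.

```python
import io
import zipfile


def read_bpseq_members(zip_source):
    """Return {member name: text} for every .bpseq file in the archive."""
    contents = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".bpseq"):
                contents[name] = archive.read(name).decode("utf-8")
    return contents


# Demonstration on an in-memory archive with hypothetical member names;
# with a real download, pass the path of the .zip file instead.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("model_1.bpseq", "1 G 0\n")
    archive.writestr("readme.txt", "not a structure\n")
buffer.seek(0)

structures = read_bpseq_members(buffer)
```

Only members whose names end in .bpseq are returned, so any other files in the archive are ignored.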
3D Graphs and Atomic Models
Each 3D graph and 3D atomic model is generated in PDB format (see PDB file format). All graph and model files are compressed and provided as a .zip file for ease of download.