New graphic view of BioLabDonkey user interface

Version 3.0 of BioLabDonkey is now available in the Mac App Store. It has a new, clean graphical user interface. The description below covers the BioLabDonkey functionality as it appears in the old text-based user interface. It also applies to the new version, except that the text-based buttons have been replaced by pictograms.

Assembly

Important: the assembly can be performed efficiently for up to several hundred reads, not more!

“open” button – opens files with extensions: .ab1, .abi, .txt.
“save” button – saves the selected contig consensus to a file with the .txt extension.
“delete” button – deletes the selected read files from the files table.
“clear” button – deletes all read files from the files table.
“print” button – prints the assembly graphic map or saves it to a pdf or eps file.

  • Reads files table
  • Assembly – graphic map, assembled reads, consensus and coverage

Reads files table
Press the “open” button to open one or multiple read files (.ab1, .abi, .txt). The file names will appear in the “File Name” column. To delete files, select their rows and press the “delete” button. To clear the files table, press the “clear” button.

Assembly – graphic map, assembled reads, consensus and coverage
To start the assembly, select the minimal read overlap value, then press the “assemble” button. Try different values to get an appropriate result. To cancel the assembly, press the “cancel” button.
The assembly result will appear as a set of contigs in the contigs table. Select a contig in the table to see the assembly map. In the map each arrow represents a sequence file. A green arrow corresponds to the original sequence of a file, and a red arrow represents its reverse complement. Click on an arrow to see the corresponding file in the files table.
For .ab1 and .abi files, a double click on the selected file row opens a new window with the ABI chromatogram.
If the contig length is less than 3000 bp, the read sequences and consensus are displayed. For longer contigs, click and drag on the assembly map to see the read sequences and consensus in the corresponding region. To shift the view frame to the left or right, use the “<” and “>” buttons.
To return to the whole contig view, use the “return” button.

Below the assembly map, vertical red and green bars show the assembly coverage for each nucleotide. Red bars mark nucleotide variations. On the assembly map these variations are displayed as dashed red vertical lines. Place the mouse pointer over a bar to see the coverage number and the nucleotide variation. This functionality is available when the contig frame is less than 3000 bp.
Press the “save” button to save a contig consensus sequence to a .txt file. To open a consensus sequence in the “DNA1”, “DNA2” or “Alignment” tab, press the corresponding button.
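
The role of the minimal read overlap value and of the reverse complement orientation can be illustrated with a small Python sketch. It is not the BioLabDonkey assembly code, which is not described on this page: the function names are hypothetical, and only exact suffix/prefix matches are considered.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def suffix_prefix_overlap(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no exact overlap of at least `min_overlap` bases exists.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest = min(len(left), len(right))
    for length in range(longest, min_overlap - 1, -1):
        if left[-length:] == right[:length]:
            return length
    return 0


def best_orientation(left: str, right: str, min_overlap: int):
    """Try `right` as given and as its reverse complement; keep the longer overlap."""
    forward = suffix_prefix_overlap(left, right, min_overlap)
    reverse = suffix_prefix_overlap(left, reverse_complement(right), min_overlap)
    if forward >= reverse:
        return "original", forward
    return "reverse complement", reverse


print(suffix_prefix_overlap("ACGTACGTTT", "CGTTTGGA", 4))  # 5
print(suffix_prefix_overlap("ACGTACGTTT", "CGTTTGGA", 6))  # 0
```

Raising the minimal overlap makes joins stricter (fewer false joins, more separate contigs), while lowering it has the opposite effect, which is why several values may need to be tried.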

Privacy Policy (PathFolding, PathFoldingPro)

Collection and Use of Personal Information
No personal information (data) that can be used to identify or contact a single person is collected.

Collection and Use of Non-Personal Information
No non-personal information (data) is collected.

Cookies and Other Technologies
No cookies and other technologies are collected.

Disclosure to Third Parties and Service Providers
The app does not have anything for user profiling.

Third‑Party Sites and Services
The app has no connections and no links to third-party services.

Ⓒ 2019 PathFolding, PathFoldingPro, Valeriy Tarasov

Prediction of protein secondary structure in BioLabDonkey

Prediction of protein secondary structures as sequence pattern search
Secondary structure sequences can be extracted from PDB files, and the patterns generated from these sequences can be used to search for matches in a target sequence.

Problem with the prediction using patterns
A database of protein patterns is not feasible to generate or to use because there are too many pattern variants. For an alpha helix pattern 8 aa long (about two turns), the maximal number of pattern variants is 20 to the power of 8 (about 2.56 × 10^10).
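
The size of the pattern space is the alphabet size raised to the pattern length. The short Python sketch below checks this arithmetic for the standard 20-letter amino acid alphabet and for a smaller alphabet; the function name is illustrative only.

```python
def pattern_space(alphabet_size: int, pattern_length: int) -> int:
    """Upper bound on distinct patterns: one symbol choice per position."""
    return alphabet_size ** pattern_length


print(pattern_space(20, 8))  # 25600000000 (20 amino acids, 8 positions)
print(pattern_space(8, 8))   # 16777216 (an 8-letter alphabet, 8 positions)
```

Shrinking the alphabet from 20 to 8 letters reduces the upper bound by a factor of about 1500.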

Solution to the problem of high number of patterns
Amino acids can be grouped according to their physicochemical properties: hydrophobicity, negative or positive charge, etc. The 20 amino acids can be split into 8 groups, and for an 8 aa alpha helix pattern in the new code the maximal number of variants is 16777216 (8 to the power of 8). The real number of alpha helix patterns should be smaller than that. The following grouping was used in BioLabDonkey Version 1.0:

Standard code – property – New code
1. V, I, L, W, F, C – very hydrophobic – W
2. A, M – less hydrophobic – V
3. N, Q, S, T, Y – polar neutral – O
4. D, E – negatively charged – N
5. K, R – positively charged – P
6. H – B
7. G – F
8. P – S

With this grouping the prediction produced more false positives. A better outcome is obtained with the following grouping, implemented in the BioLabDonkey update, Version 1.1 (from 23.10.2019):

Standard code – property – New code
1. V, I, L, F, C, M – hydrophobic – W
2. A – V
3. S, T – polar, hydroxylic – O
4. D, E – negatively charged – N
5. K, R – positively charged – P
6. H – B
7. G – F
8. P – S
9. W, Y – polar, aromatic – A
10. N, Q – polar, amide – Q
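
The Version 1.1 grouping can be written as a lookup table that re-codes a protein sequence into the reduced alphabet. This is a minimal Python sketch: the dictionary follows the table above, while the names `GROUPS_V1_1` and `encode` and the placeholder `X` for unrecognised letters are assumptions made for illustration.

```python
# New code letter -> standard one-letter amino acid codes (Version 1.1 table).
GROUPS_V1_1 = {
    "W": "VILFCM",  # hydrophobic
    "V": "A",
    "O": "ST",      # polar, hydroxylic
    "N": "DE",      # negatively charged
    "P": "KR",      # positively charged
    "B": "H",
    "F": "G",
    "S": "P",
    "A": "WY",      # polar, aromatic
    "Q": "NQ",
}

# Invert the table: standard amino acid -> new code letter.
AA_TO_CODE = {aa: code for code, members in GROUPS_V1_1.items() for aa in members}


def encode(sequence: str, unknown: str = "X") -> str:
    """Re-code a protein sequence; unrecognised letters become `unknown` (assumption)."""
    return "".join(AA_TO_CODE.get(aa, unknown) for aa in sequence.upper())


print(encode("MKTAYIAK"))  # WPOVAWVP
```

Both the database patterns and the target sequence would be re-coded in the same way before matching, so that sequences differing only by substitutions within a group map to the same pattern.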

Generation of database for alpha helices, beta strands and turns from PDB
Alpha helix. The minimal pattern size for an alpha helix was set to 8 aa, about two turns. An octamer is considered the minimum for a stable alpha helix. The octamer patterns were extracted from alpha helix regions in PDB.
Beta strand. The beta strand sequences were taken as they are present in PDB.
Turn. The turn patterns were defined as sequences 4 aa long that include glycine or proline.

The database was generated from the following organisms:
Saccharomyces cerevisiae,
Helicobacter pylori,
Klebsiella pneumoniae,
Escherichia coli,
Mycobacterium tuberculosis,
Pseudomonas aeruginosa,
Salmonella typhimurium,
Staphylococcus aureus,
Streptococcus pneumoniae,
Vibrio cholerae,
Bacillus subtilis,
Homo sapiens

The number of generated patterns in the BioLabDonkey database, Version 1.1 (from 23.10.2019): alpha helix – 200155, beta strand – 26054

The search order follows the mechanism of cotranslational folding: because alpha helix formation can occur inside the ribosomal exit tunnel, before beta sheets form, the alpha helices were searched first. The sequence regions not occupied by alpha helices were then searched for beta strands. Turn patterns were tested only in the sequence regions free of alpha helices and beta strands.
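
The search order (helices first, then strands, then turns, each only in regions still unassigned) can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's implementation: patterns are matched exactly, the sequence and patterns are already re-coded, and the function names and the labels H, E, T and "-" are hypothetical.

```python
def mark_matches(encoded, patterns, labels, symbol):
    """Label every window that matches a pattern and is free or already `symbol`."""
    # Longer patterns first; ties broken alphabetically for a deterministic result.
    for pattern in sorted(patterns, key=lambda p: (-len(p), p)):
        size = len(pattern)
        for start in range(len(encoded) - size + 1):
            window = labels[start:start + size]
            allowed = all(label in ("-", symbol) for label in window)
            if allowed and encoded[start:start + size] == pattern:
                labels[start:start + size] = [symbol] * size


def predict(encoded, helix_patterns, strand_patterns, turn_patterns):
    """Assign H (helix), E (strand), T (turn) or '-' to each position."""
    labels = ["-"] * len(encoded)
    mark_matches(encoded, helix_patterns, labels, "H")   # searched first
    mark_matches(encoded, strand_patterns, labels, "E")  # only outside helices
    mark_matches(encoded, turn_patterns, labels, "T")    # only in what is left
    return "".join(labels)


# Toy pattern sets, invented for this example.
print(predict("WPOVAWVPOFSOWWWW", {"WPOVAWVP"}, {"WWWW"}, {"OFSO"}))
# HHHHHHHHTTTTEEEE
```

Because a window is accepted only when every position is free or already carries the same label, overlapping helix octamers merge into one longer helix, while a strand or turn pattern can never overwrite an earlier assignment.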

Evaluation of the prediction accuracy (random examples)

1. Comparison of the prediction with the secondary structure from a PDB entry when the pattern database does not include the patterns from this entry. A good prediction is expected to have as few false negative and false positive secondary structures as possible.

D-ornithine/D-lysine decarboxylase from Salmonella typhimurium

Xenopus laevis MHC I complex

2. Comparison of the prediction with the secondary structure from a PDB entry when the pattern database includes the patterns from this entry. A good prediction is expected to have as few false positive secondary structures as possible.

F41 fragment of flagellin of Salmonella typhimurium

3. Comparison of the prediction with the results of other algorithms – machine learning–based techniques.

Comparison with Jpred4 (no similarity to sequences with known PDB) for “NgrC protein” (from Providencia stuartii plasmid pTC2)

Comparison with Jpred4 (no similarity to sequences with known PDB) for “hypothetical protein” (from Providencia stuartii plasmid pTC2)

Privacy Policy

Collection and Use of Personal Information
No personal information (data) that can be used to identify or contact a single person is collected.

Collection and Use of Non-Personal Information
No non-personal information (data) is collected. The user can generate database txt files for DNA feature annotation and protein secondary structure prediction. These files are stored inside the program and can be exported or imported by the user.

Cookies and Other Technologies
No cookies or other technologies are used.

Disclosure to Third Parties and Service Providers
There is no disclosure to third parties and service providers.

The Existence of Automated Profiling
The program does not have anything for user profiling.

Third‑Party Sites and Services
The program contains links to third-party public websites related to the DNA, RNA and protein analysis.

Ⓒ 2019 BioLabDonkey, Valeriy Tarasov

Info/Calculators

  • Dilution calculator
  • Molarity calculator
  • Ligation calculator
  • SDS PAG calculator
  • Nucleic acid OD conversion
  • Ammonium sulfate calculator
  • Bacteria growth calculator
  • Units conversion

Dilution calculator
Use this calculator to calculate the amount of stock solution needed to prepare a solution of the desired molar concentration. 
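
The calculation can be sketched in Python, assuming the standard dilution relation C1 × V1 = C2 × V2. The function name is hypothetical, and the app's own inputs and units may differ.

```python
def stock_volume(stock_conc: float, final_conc: float, final_volume: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2.

    Both concentrations must use the same unit; the result has the unit of final_volume.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and more concentrated than the target")
    return final_conc * final_volume / stock_conc


# 50 mL of a 1 M solution from a 10 M stock needs 5 mL of stock.
print(stock_volume(10.0, 1.0, 50.0))  # 5.0
```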

Molarity calculator
Use this calculator to calculate the amount of a chemical substance (in grams) required to prepare a solution of the desired molar concentration.
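
As a sketch of the underlying relation, the mass is the molar concentration times the volume times the molecular weight. The function name and the choice of units (mol/L, L, g/mol) are assumptions made for this example.

```python
def mass_for_molarity(conc_mol_per_l: float, volume_l: float, mol_weight: float) -> float:
    """Mass in grams = concentration (mol/L) * volume (L) * molecular weight (g/mol)."""
    return conc_mol_per_l * volume_l * mol_weight


# 250 mL of 0.5 M NaCl (58.44 g/mol) requires about 7.3 g.
print(round(mass_for_molarity(0.5, 0.25, 58.44), 3))  # 7.305
```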

Ligation calculator
Use this to calculate the required amounts of vector and insert DNA needed to achieve a particular ratio of insert to vector.
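
A commonly used relation for this kind of calculation scales the vector mass by the length ratio and by the desired molar ratio. The Python sketch below assumes that relation; the function name and the units (ng, bp) are illustrative and may not match the app's fields.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector_ratio: float) -> float:
    """Insert mass (ng) giving the requested insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


# 50 ng of a 3000 bp vector, 1000 bp insert, 3:1 insert:vector ratio.
print(round(insert_mass_ng(50.0, 3000, 1000, 3.0), 2))  # 50.0
```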

SDS PAG calculator
Use this calculator to calculate the amounts of components needed for preparing the separating and stacking gels of a discontinuous gel.

Nucleic acid OD conversion
Use this calculator to calculate the concentration of nucleic acids in a solution (in micrograms per millilitre). The path length is the width of the cuvette, which is commonly 10 mm (1 cm).
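
The conversion can be sketched in Python using widely quoted approximate factors (µg/mL per absorbance unit at 260 nm and a 1 cm path). These factors, the dilution parameter and the function name are assumptions for illustration; the values used by the app are not stated here.

```python
# Approximate conversion factors, ug/mL per A260 unit at 1 cm path length.
A260_FACTORS = {"dsDNA": 50.0, "ssDNA": 33.0, "RNA": 40.0}


def nucleic_acid_conc(a260: float, kind: str = "dsDNA",
                      dilution: float = 1.0, path_length_cm: float = 1.0) -> float:
    """Concentration in ug/mL from absorbance at 260 nm."""
    return a260 * A260_FACTORS[kind] * dilution / path_length_cm


# A260 of 0.5 for a 10-fold diluted dsDNA sample in a 1 cm cuvette.
print(nucleic_acid_conc(0.5, "dsDNA", dilution=10.0))  # 250.0
```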

Ammonium sulfate calculator
Use this calculator to calculate the final volume of the solution and the amount of ammonium sulfate (in grams) to add to the initial solution. The calculation can be performed for different temperatures.
The calculator is based on the following publication: Paul T. Wingfield. Protein Precipitation Using Ammonium Sulfate. Curr Protoc Protein Sci. 2001 May; Appendix 3: Appendix 3F. doi:10.1002/0471140864.psa03fs13.

Bacteria growth calculator (for the exponential phase) 
Use this calculator to estimate the time required until a desired cell density is reached (expected OD). 
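
For exponential growth the density doubles once per doubling time, so the waiting time is the doubling time multiplied by log2 of the ratio between the expected and the current OD. The Python sketch below assumes that OD is proportional to cell density over the range used; the function and parameter names are hypothetical.

```python
import math


def time_to_target_od(current_od: float, target_od: float, doubling_time: float) -> float:
    """Time until target OD, from N(t) = N0 * 2 ** (t / doubling_time)."""
    if current_od <= 0 or target_od < current_od:
        raise ValueError("ODs must be positive and the target not below the current OD")
    return doubling_time * math.log2(target_od / current_od)


# From OD 0.1 to OD 0.8 is three doublings: about 90 minutes at 30 minutes per doubling.
print(round(time_to_target_od(0.1, 0.8, 30.0), 1))  # 90.0
```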

Units conversion
Use this calculator to convert grams to moles and moles to grams, as well as to calculate the corresponding number of molecules of the nucleic acid or protein.
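
These conversions follow from the molecular weight and the Avogadro constant. The Python sketch below takes the molecular weight as an input; the function names are illustrative, and how the app derives the molecular weight of a sequence is not covered here.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def grams_to_moles(mass_g: float, mol_weight: float) -> float:
    """Moles = mass (g) / molecular weight (g/mol)."""
    return mass_g / mol_weight


def moles_to_grams(moles: float, mol_weight: float) -> float:
    """Mass (g) = moles * molecular weight (g/mol)."""
    return moles * mol_weight


def molecule_count(moles: float) -> float:
    """Number of molecules in the given amount of substance."""
    return moles * AVOGADRO


# 1 microgram of a 50 kDa protein.
moles = grams_to_moles(1e-6, 50000.0)
print(moles, molecule_count(moles))
```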

Gel/Blot/Cells/Colonies

In this window, the relative intensities of bands (on gels or blots) can be measured, and the number of colonies (on agar plates), cells or spots (in a microscopic field) can be determined. 
To open an image file, drag and drop the file into the image pane.  
Use the “print” button to print the content of the image view or the table of folds, or to save it to a pdf or PostScript file.

  • Image adjustment
  • Labels
  • Grid for cells counting
  • Calculation of relative fold-change in selected bands/spots
  • Colonies/Cells/Spots counting

Image adjustment
To adjust the image, use the “contrast” and “brightness” sliders. Use the “convert to grey” and “invert color” buttons to change the image correspondingly. To cancel all changes, use the “cancel” button.

Labels
To label features or to write a title on the picture, enter text into the “label” field and press the “paste” button. The text can be moved to the required position by dragging it with the mouse (or trackpad). Use the color well to set the label color, and the “font size” popup menu to adjust the font size. To remove all the labels, use the “remove” button.

Grid for cells counting
To place a grid over the image, set the On/Off grid checkbox to On. The spacing of the grid lines can be adjusted with the scale slider. To set the grid color, use the color well.

Calculation of relative fold-change in selected bands/spots
Select the bands/spots to analyse by using the cursor to drag rectangle frames around them. Do this by clicking and dragging with the mouse down. To change the frame color, use the color well. 
To move a created frame, click inside the frame and drag.
To remove the frame, double-click inside the frame. To remove all frames use the “remove” button.
Before the calculation, select the alignment of the frames of interest, vertical or horizontal, e.g. within one gel lane or across different gel lanes. The frames will be numbered according to their vertical or horizontal alignment. To remove the alignment numbers, press the “none” button. If a frame is moved, the corresponding frame alignment numbers will be automatically redrawn according to the new frame positions.
Select the image background to be dark or light.
Use the “calculate” button to calculate the band/spot intensities and the fold changes in density. For each frame the local background will be subtracted. Press the “remove” button to clear the table and to remove the frames.
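
The idea of background-subtracted intensities and fold changes can be sketched in Python. How BioLabDonkey estimates the local background is not stated here; this sketch assumes the median of the frame's border pixels, 8-bit grey values, and hypothetical function names.

```python
from statistics import median


def frame_intensity(frame, light_background=False, max_value=255):
    """Sum of grey values above the local background inside one frame.

    `frame` is a list of pixel rows. The local background is taken as the
    median of the border pixels, which is an assumption of this sketch.
    """
    if light_background:
        # Invert so that dark bands on a light background become bright.
        frame = [[max_value - value for value in row] for row in frame]
    border = (frame[0] + frame[-1]
              + [row[0] for row in frame[1:-1]]
              + [row[-1] for row in frame[1:-1]])
    background = median(border)
    return sum(max(value - background, 0) for row in frame for value in row)


def fold_changes(intensities, reference_index=0):
    """Intensity of each frame relative to the chosen reference frame."""
    reference = intensities[reference_index]
    return [value / reference for value in intensities]


band_1 = [[10, 10, 10], [10, 110, 10], [10, 10, 10]]
band_2 = [[10, 10, 10], [10, 60, 10], [10, 10, 10]]
print(fold_changes([frame_intensity(band_1), frame_intensity(band_2)]))  # [1.0, 0.5]
```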

Colonies/Cells/Spots counting
The counting can be done for the whole image area or in selected sectors. To draw a sector, switch on the “draw sectors” checkbox, then click around the target area; this draws an open contour. Double-click to close the contour, which automatically generates a sector name in alphabetical order.
The counting results for each sector and for the rest area as “Rest” will be displayed in the table. If no sectors are created the counting will be displayed in the table for the whole image area under the name “All”.
Use the sectors color well to set the color for the sectors.
To remove all sectors and to remove from the table the corresponding counts use the “remove all sectors” button. To remove a single sector set the “draw sectors” checkbox off and double click inside the sector. All the counts in the table will be automatically corrected. 
All counted colonies/cells/spots will be outlined. Use the contours color well to set the color for the contours. Falsely counted objects can be removed by outlining them (set the “draw sectors” checkbox off, click and drag a rectangle over them), and double clicking inside the outlined region (frame). After removal, the count will be immediately corrected. To remove all the contours and to clear the count use the “remove all contours” button.
      Colonies counting 
To count colonies on a culture plate, select the “colonies” section on the segmented button (default). Then convert the image into grey scale mode and press the “count” button. If the default settings of colony shade and background color are not optimal, adjust the colony shade by clicking inside a colony, and the background color by clicking outside the colonies. Adjust the colony color brightness using the HSB slider and press the “count” button again to see the new count. The counted colonies will be outlined. Merged colonies are not counted and can be added to the single colony count by drawing a frame around each colony within a close pair or cluster, in the same way as described above for the gel bands. To remove all the frames and to correct the counts, use the “remove all frames” button. To remove a single frame, double-click inside the frame.

The picture of yeast colonies on an agar plate is courtesy of Mike Dyall-Smith. 


      Cells/Spots counting 
To count cells (or spots within cells), select the “cells” section on the segmented button and set the cells/spots color as well as the background color. To set the colonies/cells/spots color, activate the colonies/cells color well and click on a colony/cell. Use the HSB slider to reduce the brightness for a grey scale image, or both brightness and saturation for a color image. To set the background color, activate the background color well and click on the image background. The next steps are the same as for the colony count described above. Try different brightness and saturation values to get the best results.

The picture of immunofluorescently labeled cells is courtesy of Dr. Dmitri Lodygin, University of Goettingen. The cyan coloured spots are counted and outlined in blue.

Alignment

“open” button – opens files with extensions: .txt, .fasta for the table of sequences to be aligned; .aln, .blalignment to display the aligned sequences.
“save” button – saves the aligned sequences without names to files with extensions: .blalignment or .pdf, .eps.
“close” button – closes the file and clears the window of aligned sequences.
“print” button – prints the aligned sequences with names or saves them to a pdf or eps file.

The blalignment file format is a text file format to open/save the aligned DNA/Protein sequences. 

  • Table of DNA or protein sequences to align
  • Multiple alignment
  • Realignment
  • Phylogenetic tree

Table of DNA or protein sequences to align
To open a sequence file (.txt, .fasta) in the table, use the “open” button. The file name will appear in the column “Name” and the sequence in the column “Sequence”.
Use the “add” button to add an empty row. Then paste a name and a sequence into the corresponding columns. To delete a row, select this row and press the “delete” button. Multiple rows can also be selected and deleted.

Multiple alignment
Multiple alignment can be performed for DNA or protein sequences using two custom algorithms (BioLabDonkey1,2), the classical Needleman-Wunsch algorithm or the “Translated DNA” algorithm.
In contrast to the Needleman-Wunsch algorithm, the custom algorithms provide correct alignment independent of the size of the indels in the sequences. The custom algorithms are also faster than the Needleman-Wunsch algorithm for long sequences with high homology.
For the Needleman-Wunsch algorithm there are two options: the basic score system or similarity matrices for proteins (default).
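
For reference, the classical Needleman-Wunsch algorithm with a basic score system can be sketched in Python for a pair of sequences. The scores used here (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration, not the values used by BioLabDonkey, and the sketch does not cover similarity matrices or multiple alignment.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Global pairwise alignment; returns (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # first column: a aligned against gaps
        score[i][0] = i * gap
    for j in range(1, cols):          # first row: b aligned against gaps
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


print(needleman_wunsch("ACGT", "AGT"))  # ('ACGT', 'A-GT', 2)
```

The score table has (len(a) + 1) × (len(b) + 1) cells, so time and memory grow with the product of the two sequence lengths.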


The ORF sequences can be aligned using the “Translated DNA” algorithm. For this, set the open reading frame start in the corresponding popup menu (frame 1, 2 or 3). The algorithm translates the ORF sequences, aligns the protein sequences and then converts the aligned sequences back to DNA sequences.
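
The last step, converting an aligned protein row back to DNA, can be sketched in Python: each residue is replaced by its original codon and each gap by three gap characters. The function name is hypothetical, and the translation and protein alignment steps are assumed to have been done already.

```python
def protein_alignment_to_dna(aligned_protein: str, dna: str, frame: int = 1) -> str:
    """Thread the codons of `dna` onto one gapped row of a protein alignment.

    `frame` is the 1-based reading frame start (1, 2 or 3).
    """
    coding = dna[frame - 1:]
    usable = len(coding) - len(coding) % 3        # ignore an incomplete last codon
    codons = [coding[i:i + 3] for i in range(0, usable, 3)]
    residues = sum(1 for symbol in aligned_protein if symbol != "-")
    if residues > len(codons):
        raise ValueError("protein row has more residues than the DNA has codons")

    out, next_codon = [], 0
    for symbol in aligned_protein:
        if symbol == "-":
            out.append("---")                     # one protein gap = three DNA gaps
        else:
            out.append(codons[next_codon])
            next_codon += 1
    return "".join(out)


print(protein_alignment_to_dna("M-K", "ATGAAA"))            # ATG---AAA
print(protein_alignment_to_dna("M-K", "GATGAAA", frame=2))  # ATG---AAA
```

Aligning at the protein level and threading codons back keeps gaps in multiples of three, so the reading frame is preserved in the DNA alignment.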

To align protein sequences using the BioLabDonkey1,2 algorithms, set the “AA substitutions” checkbox and the minimal identity in the corresponding popup menu. For the Needleman-Wunsch algorithm, set the basic score system or similarity matrices. Then press the “align” button.
To change the color scheme of aligned sequences use the corresponding protein or DNA color schemes popup menus.

Realignment
The part of aligned sequences in a block can be realigned using different algorithms. Set the algorithm for the realignment in the corresponding popup menu. Then, select a part of any line in the block and press the “realign” button. The realignment will not be performed if the selection is not a part of a line in one block.

Phylogenetic tree
The sequences similarity can be represented as a phylogenetic tree. Press the “Phylogenetic tree” button to generate and see the phylogenetic tree. The tree type can be set as UPGMA or the custom “Similarity tree”. The tree can be shifted to the left/right side using the “tree shift” slider. The scale of the tree can be changed using the “tree fold” slider. 
The values of pairwise identity distances between the sequence pairs are calculated as follows:
N = ((Ni / Nt1) + (Ni / Nt2)) / 2 
Ni – number of identical aa/bp between two sequences 
Nt1 – total number of aa/bp in sequence 1 
Nt2 – total number of aa/bp in sequence 2 
Distance = 1 / N 
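
The formula above translates directly into Python. The function name and the handling of the case with no identical positions (an infinite distance) are assumptions of this sketch.

```python
def identity_distance(n_identical: int, length_1: int, length_2: int) -> float:
    """Distance = 1 / N, with N = ((Ni / Nt1) + (Ni / Nt2)) / 2."""
    n = ((n_identical / length_1) + (n_identical / length_2)) / 2
    if n == 0:
        return float("inf")  # no identical positions (assumed convention)
    return 1 / n


print(identity_distance(80, 100, 100))  # 1.25
print(identity_distance(100, 100, 100))  # 1.0
```

Identical sequences give the minimal distance of 1, and the distance grows as the averaged identity fraction falls.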

The pairwise identity distances can be recalculated for different identity minimums (set the “minimal identity” popup menu), as well as by taking into account amino acids substitutions (set the “AA substitutions” checkbox). 
UPGMA tree

Similarity tree
The length of a branch between two points (two connected orange circles) represents the identity distance between two sequences. For each connected sequence pair in the tree, the distance between the two sequences is the smallest distance (highest identity) compared with the distances of these sequences to all other sequences.