iCAN: Institute Collection and Analysis of Nanobodies

User's manual

Introduction

Nanobodies are single-domain antibodies derived from the variable regions of the atypical immunoglobulins (Igs) found in Camelidae. They are highly valued as high-affinity reagents for research, diagnostics and therapeutics owing to their high specificity, small size (~15 kDa) and straightforward bacterial expression. Nanobodies are now being studied for use in various disease areas, including oncology and infectious, inflammatory, and neurodegenerative diseases. They are widely expected to find broad application in diagnosis and therapy.

iCAN was created to support academic research on nanobodies and their clinical application. To our knowledge, it is the first comprehensive database of nanobodies. This manually curated database currently holds 2490 nanobody sequences, including 107 nanobodies from RCSB PDB and 2226 nanobodies from patents. Each entry provides the nanobody DNA sequence, protein sequence, structure, target antigens, function, taxonomy of the source organism, and links to external databases such as PDB and EMBL. Frequently used tools such as BLAST and Clustal Omega are included, and the website also supports sequence upload and analysis. The database will be updated monthly with additional nanobodies.

Search

iCAN offers two kinds of search: basic and advanced. Basic search finds entries by keyword, such as nanobody name, antigen, PDB ID, function, PubMed ID, or source organism. Advanced search restricts the results to entries that match a combination of several fields.

Both searches are case insensitive. A complete list of the field descriptors and their descriptions is given below:

DESCRIPTOR	DESCRIPTION
Antigen	The name of an antigen, e.g. GFP
PDB ID	PDB entry name, e.g. 2X1O
Name	The name of the nanobody in iCAN, e.g. CAN_002
PubMed ID	PubMed entry number, e.g. 23911607
Function	The function of the nanobody, e.g. Food testing
Source organism	The animal source of the nanobody, e.g. Lama glama
Bacteria family	The bacterial strain used to express the nanobody, e.g. E. coli TG1
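
As a rough illustration of how a case-insensitive basic and advanced search could behave, the sketch below filters a small in-memory list of records. The record layout, the field keys, the function names and the sample entries are all assumptions made for illustration; they are not iCAN's actual schema, data or implementation.

```python
# Invented sample records; the keys loosely mirror the descriptors above.
RECORDS = [
    {"name": "EXAMPLE_1", "antigen": "GFP", "function": "Imaging",
     "source_organism": "Lama glama"},
    {"name": "EXAMPLE_2", "antigen": "Lysozyme", "function": "Food testing",
     "source_organism": "Camelus dromedarius"},
]


def basic_search(records, keyword):
    """Return records in which any field contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [r for r in records
            if any(needle in str(value).casefold() for value in r.values())]


def advanced_search(records, **criteria):
    """Return records that match every field=keyword pair, ignoring case."""
    def matches(record):
        return all(str(kw).casefold() in str(record.get(field, "")).casefold()
                   for field, kw in criteria.items())
    return [r for r in records if matches(r)]


print([r["name"] for r in basic_search(RECORDS, "gfp")])
print([r["name"] for r in advanced_search(RECORDS, antigen="lysozyme",
                                          function="food")])
```

Substring matching is used here for simplicity; whether iCAN matches whole words or substrings is not stated in this manual.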

Analysis

The analysis interface provides four frequently used tools for sequence analysis.

BLAST

BLAST (Basic Local Alignment Search Tool) is a sequence comparison tool that finds regions of local similarity between sequences. Users can compare protein or nucleotide sequences against chosen sequence databases and obtain match statistics that help judge the confidence of each alignment. The tool allows scientists to infer the function of a sequence from similar sequences. It can also be used to infer evolutionary relationships between sequences and to help identify members of a family.

BLAST in iCAN allows users to choose the database of interest, such as the entire database or only the patented entries.

A link to NCBI BLASTP is also supplied for users who want to search the full NCBI datasets.

How to use this tool?

Step-1 Enter Query Sequence

Users should enter the query sequence in FASTA format directly into the input box.

Step-2 Set parameters

Default parameter choices are set for the intended uses of the tools. Users can adjust them according to their needs.

E-value

The Expect value (E) is a parameter that describes the number of hits one can "expect" to see by chance when searching a database of a particular size. The lower the E-value, or the closer it is to zero, the more "significant" the match is. When the Expect value is increased from the default value of 10, a larger list with more low-scoring hits can be reported.

Default value is: 10
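
For intuition, the E-value can be approximated from an alignment's bit score S' as E = m × n × 2^(−S'), where m is the query length and n is the total length of the database (the standard Karlin-Altschul relation). The sketch below is illustrative only: the function name and the example numbers are assumptions, and BLAST implementations use corrected "effective" lengths rather than the raw ones.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E-value: E = m * n * 2**(-S') for bit score S'."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Arbitrary example numbers: the same hit scored against a small and a
# ten-times larger database. The larger search space gives a larger E-value.
small = expect_value(bit_score=40, query_len=120, db_len=300_000)
large = expect_value(bit_score=40, query_len=120, db_len=3_000_000)
print(small, large)

# A hit is reported only when its E-value is at or below the chosen threshold.
THRESHOLD = 10  # the default stated above
print(small <= THRESHOLD)
```

This shows why the same alignment is less significant in a larger database, and why raising the threshold lets more low-scoring hits through.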

Alignment

Choose the alignment pattern, gapped or ungapped.

Default value is: ungapped

Matrix

The "substitution matrix" assigns a score to every possible pair of aligned residues and is a key element in evaluating the quality of a pairwise sequence alignment. Users can select the scoring matrix according to the features of their sequences and their needs.

Default value is: BLOSUM45
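
To show how a substitution matrix turns an alignment into a score, the sketch below sums the pair scores of an ungapped alignment. The matrix here is a toy one (a fixed reward for identical residues and a fixed penalty otherwise) built by a hypothetical helper; it does not contain BLOSUM45 values, which differ for every residue pair.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_toy_matrix(alphabet, match=2, mismatch=-1):
    """Build a toy matrix: `match` for identical residues, else `mismatch`."""
    return {(a, b): (match if a == b else mismatch)
            for a in alphabet for b in alphabet}


def score_ungapped(seq_a, seq_b, matrix):
    """Sum the matrix score of each aligned residue pair (no gaps allowed)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped alignment needs sequences of equal length")
    return sum(matrix[(a, b)] for a, b in zip(seq_a, seq_b))


toy = make_toy_matrix(AMINO_ACIDS)
print(score_ungapped("QVQL", "QVKL", toy))  # three matches, one mismatch -> 5
```

With a real matrix such as BLOSUM45, conservative substitutions would score higher than unlikely ones, so the choice of matrix changes which alignments rank best.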

Databases

Users can choose which database to search against.

Default value is: ALL

Alignment View

Choose the alignment view, pairwise or multiple.

Default value is: pairwise

Step-3 Submission

References

Gibrat JF, Madej T, Bryant SH. Surprising similarities in structure comparison. Curr Opin Struct Biol. 1996 Jun; 6(3): 377-85.

FASTA format

A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than ('>') symbol in the first column.

For example:

>ENA|AJ238057|AJ238057.1 Lama glama partial mRNA for immunoglobulin heavy chain variable region (IGHV gene), clone WH25
CTGCAGGAGTCAGGGGGAGGCTTGGTGCAGCCTGGGGGGTCTCTGAAACTCTCCTGTGCG
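
The format described above can be read with a few lines of code. The following is a minimal sketch of a FASTA reader, not the parser used by iCAN; the function name is hypothetical, and the sketch simply skips blank lines and joins multi-line sequence data.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (description, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before a '>' description line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = (
    ">ENA|AJ238057|AJ238057.1 Lama glama partial mRNA for immunoglobulin "
    "heavy chain variable region (IGHV gene), clone WH25\n"
    "CTGCAGGAGTCAGGGGGAGGCTTGGTGCAGCCTGGGGGGTCTCTGAAACTCTCCTGTGCG\n"
)
for description, sequence in parse_fasta(example):
    print(description.split()[0], len(sequence))
```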

Clustal Omega

Clustal Omega is a tool for multiple sequence alignment and the latest addition to the Clustal family. Thanks to its HMM-based alignment engine, it can align hundreds of thousands of sequences quickly and still deliver accurate alignments. Users can paste their sequences in FASTA format. After alignment, users can explore evolutionary relationships by viewing cladograms or phylograms, which is useful for discovering and designing novel nanobody sequences.

How to use this tool?

Step-1 Sequence

The first step is to set the tool input. Users can input sequences directly or upload sequence files.

Sequence Input Window & Sequence File Upload

Users can enter three or more sequences to be aligned directly into the input box. Sequences should be in FASTA format. A line break (return) should be added at the end of the last sequence to help certain applications parse the input. Note that files or text copied from a word processor may give unpredictable results, because hidden or control characters may be present. There is currently a limit of 2000 sequences and 2MB of data.
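
The input rules above can be checked before submission. The sketch below is one possible pre-flight check, not iCAN's own validation; the function name is hypothetical, and reading "2MB" as 2 × 1024 × 1024 bytes is an assumption.

```python
MIN_SEQUENCES = 3
MAX_SEQUENCES = 2000
MAX_BYTES = 2 * 1024 * 1024  # assumption: "2MB" interpreted as 2 MiB


def check_alignment_input(text):
    """Return a list of problems found in FASTA text; empty if none."""
    problems = []
    if len(text.encode("utf-8")) > MAX_BYTES:
        problems.append("input is larger than 2MB")
    # Each FASTA record starts with a '>' description line.
    count = sum(1 for line in text.splitlines() if line.startswith(">"))
    if count < MIN_SEQUENCES:
        problems.append(f"at least {MIN_SEQUENCES} sequences are needed, found {count}")
    if count > MAX_SEQUENCES:
        problems.append(f"at most {MAX_SEQUENCES} sequences are allowed, found {count}")
    # Control characters other than tab/newline/return hint at word-processor debris.
    if any(ord(ch) < 32 and ch not in "\t\n\r" for ch in text):
        problems.append("hidden control characters found")
    if not text.endswith("\n"):
        problems.append("add a line break after the last sequence")
    return problems


print(check_alignment_input(">a\nQVQLQ\n>b\nQVKLQ\n>c\nQLQLQ\n"))  # prints []
```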

Step-2 Set parameters

Default parameter choices are set for the intended uses of the tools, and can be adjusted by the tool user.

Dealign Input Sequences

Remove any existing alignment (gaps) from input sequences.

Option	Description	Abbreviation
no		false
yes		true

Default value is: no [false]

Output Alignment Format

Format for generated multiple sequence alignment.

Option	Description	Suffix
CLUSTAL	Clustal alignment format without base/residue numbering	clu
MSF	Multiple Sequence File (MSF) alignment format	msf
PHYLIP	PHYLIP interleaved alignment format	phy
SELEX	SELEX alignment format	selex
STOCKHOLM	STOCKHOLM alignment format	st
VIENNA	VIENNA alignment format	vie

Default value is: CLUSTAL [clu]

For the "clu" format, a download button is provided to download the aligned sequences converted to FASTA format. This FASTA file can be used as the input file for motif analysis.
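
The conversion from Clustal to FASTA can be sketched as follows. This is an illustration of the idea rather than the code behind the download button; the function name is hypothetical, and the sketch assumes Clustal output without residue numbering, as in the "clu" option above.

```python
def clustal_to_fasta(clustal_text, keep_gaps=True):
    """Convert Clustal alignment text to FASTA text, keeping sequence order."""
    sequences = {}  # dicts preserve insertion order
    for line in clustal_text.splitlines():
        # Skip blank lines, the "CLUSTAL ..." header, and conservation
        # lines (which start with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, fragment = fields[0], fields[1]
        # Each block contributes one more fragment of every sequence.
        sequences[name] = sequences.get(name, "") + fragment
    lines = []
    for name, seq in sequences.items():
        if not keep_gaps:
            seq = seq.replace("-", "")
        lines.append(f">{name}\n{seq}")
    return "\n".join(lines) + "\n"


sample = (
    "CLUSTAL O(1.2.4) multiple sequence alignment\n\n\n"
    "seq1      QVQL-ES\n"
    "seq2      QVQLVES\n"
    "          **** **\n\n"
    "seq1      GGG\n"
    "seq2      GG-\n"
    "          ** \n"
)
print(clustal_to_fasta(sample), end="")
```

Gaps are kept by default so that the sequences stay aligned, which is what a motif (logo) analysis needs as input.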

Step-3 Submission

References

Sievers F., Wilm A., Dineen D., Gibson T.J., Karplus K., Li W., Lopez R., McWilliam H., Remmert M., Söding J., Thompson J.D., Higgins D.G. (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011 Oct 11;7:539. doi: 10.1038/msb.2011.75.
Goujon M., McWilliam H., Li W., Valentin F., Squizzato S., Paern J., Lopez R. (2010) A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res. 2010 Jul;38 (Web Server issue):W695-9. doi: 10.1093/nar/gkq313. Epub 2010 May 3.
McWilliam H., Li W., Uludag M., Squizzato S., Park Y.M., Buso N., Cowley A.P., Lopez R. (2013) Analysis Tool Web Services from the EMBL-EBI. Nucleic Acids Res. 2013 Jul;41 (Web Server issue):W597-600. doi: 10.1093/nar/gkt376. Epub 2013 May 13.

Motif

A motif is a short conserved region in a protein sequence. This tool draws a graphical representation (a sequence logo) of an amino acid or nucleic acid multiple sequence alignment. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of a stack indicates the sequence conservation at that position, while the height of each symbol within the stack indicates the relative frequency of that amino acid or nucleotide at that position. The width of the stack is proportional to the fraction of valid symbols in that position. (Positions with many gaps have thin stacks.) Symbols are colored according to the chemical species they represent. The default colors for nucleotides are G, orange; T and U, red; C, blue; and A, green. Amino acids are colored according to their chemical properties: polar amino acids (G, S, T, Y, C, Q, N) are green, basic (K, R, H) blue, acidic (D, E) red, and hydrophobic (A, V, L, I, P, W, F, M) black.

The Motif tool can be used to discover sequence features shared by a group of nanobodies of interest, which helps users find functional domains and design novel nanobody sequences.
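
The stack and symbol heights described above can be computed from an alignment column by column. The sketch below follows the usual sequence-logo definition (stack height = log2 of the alphabet size minus the column's Shannon entropy, in bits), but it omits the small-sample correction used in the literature and is not the code that draws iCAN's logos; all function and variable names are hypothetical.

```python
import math
from collections import Counter

# Colour groups for amino acids as listed in the text above.
AA_COLOURS = {}
for residues, colour in (("GSTYCQN", "green"), ("KRH", "blue"),
                         ("DE", "red"), ("AVLIPWFM", "black")):
    for aa in residues:
        AA_COLOURS[aa] = colour


def logo_column(column, alphabet_size=20):
    """Return stack width, stack height (bits) and per-symbol heights."""
    symbols = [c for c in column.upper() if c not in "-."]  # drop gaps
    if not symbols:
        return {"width": 0.0, "height": 0.0, "symbols": {}}
    total = len(symbols)
    freqs = {s: n / total for s, n in Counter(symbols).items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    info = math.log2(alphabet_size) - entropy  # conservation in bits
    return {
        "width": total / len(column),  # fraction of non-gap symbols
        "height": info,
        "symbols": {s: f * info for s, f in freqs.items()},
    }


def logo(aligned_seqs, alphabet_size=20):
    """Compute one stack per alignment column (sequences of equal length)."""
    return [logo_column("".join(col), alphabet_size) for col in zip(*aligned_seqs)]


for stack in logo(["QVQLQ", "QVKLQ", "Q-QLE"]):
    print(round(stack["width"], 2), round(stack["height"], 2),
          {s: AA_COLOURS.get(s) for s in stack["symbols"]})
```

A fully conserved column reaches the maximum height of log2(20) ≈ 4.32 bits for proteins (2 bits for nucleotides with `alphabet_size=4`), and columns with gaps produce narrower stacks.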

References

Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994.
Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: A sequence logo generator, Genome Research, 14:1188-1190, (2004)
Schneider TD, Stephens RM. 1990. Sequence Logos: A New Way to Display Consensus Sequences. Nucleic Acids Res. 18:6097-6100

Translation

The Translation tool translates a nucleotide (DNA/RNA) sequence into a protein sequence. Users can enter a DNA or RNA sequence in the input box. The result shows three translated sequences, one for each reading frame. The number at the beginning of each line of sequence gives the position of the first amino acid on that line.
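
Three-frame translation can be sketched as below using the standard genetic code. This is an illustration under stated assumptions, not the tool's implementation: it covers the three forward frames only, writes unknown codons as "X" and stop codons as "*", and the output layout and function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq, frame=0):
    """Translate DNA or RNA starting at offset `frame` (0, 1 or 2)."""
    seq = "".join(seq.split()).upper().replace("U", "T")  # RNA -> DNA letters
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq, width=60):
    """Format all three forward frames, numbering the first residue per line."""
    lines = []
    for frame in range(3):
        protein = translate(seq, frame)
        lines.append(f"Frame {frame + 1}")
        for start in range(0, len(protein), width):
            lines.append(f"{start + 1:>6} {protein[start:start + width]}")
    return "\n".join(lines)


dna = "CTGCAGGAGTCAGGGGGAGGCTTGGTGCAGCCTGGGGGGTCTCTGAAACTCTCCTGTGCG"
print(three_frame_translation(dna))
```

Because it is rarely known in advance which frame encodes the protein, showing all three lets the user pick the frame without premature stop codons.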

Submit

Users can submit their own nanobody data for storage and analysis. Sequences should be submitted in FASTA format together with the required information, such as the user's contact information and the antigen name. We will review each submitted sequence, annotate it, and return the result to the user.

Structure

The structure interface shows the nanobodies whose structures are available in PDB. Users can further obtain some structural information for their research through the links to PDB.

Links

The links interface shows some links to other related databases.

If you have any questions, you are welcome to contact us. Thanks for your support!