After a little more pain than usual, miRBase 18 is finally released. The database contains 18226 entries representing hairpin precursor miRNAs, expressing 21643 mature miRNA products, in 168 species. That represents 1488 new hairpin sequences and 1929 novel mature products. The full README file is available on the FTP site.
As previously discussed, we have continued to rename mature sequences, phasing out the miR/miR* nomenclature in favour of the -5p/-3p nomenclature. That affects approximately 1400 mature sequences this time, from human, mouse and C. elegans. (We had planned to do rat as well, but decided to hold off until we had incorporated more rat deep sequencing data.)
There are also significant changes to the zebrafish miRNA complement, rationalising the entries with respect to the (now not so new) Zv9 genome assembly. That has led to the deletion of 26 zebrafish entries, and the creation of 12 entries that represent duplicate loci. The full list of changes is itemized in the miRNA.diff file on the FTP site.
The website also shares new deep sequencing data — now approaching 250 datasets from NCBI GEO. In addition to raw read counts, we also show normalized read counts, currently calculated as reads per thousand reads that map to miRNAs (designated RPT on the website). We have also implemented a new feature to allow the comparison of normalized read counts from multiple experiments. For example, from the list of all D. melanogaster datasets (accessible from the “by tissue expression” box on the search page), you can tick up to 5 different experiments to compare read counts. This is getting dangerously close to allowing some really complex and powerful analyses through the website! You can also download the read counts from the results page, for offline processing. This is all Ana Kozomara’s work. As with all new features, it is wise to consider this to be in beta. We’ll be very happy to get your comments, bugs, praise, and abuse, as usual, here or by email.
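The RPT normalization described above can be sketched in a few lines of Python. This is only an illustration of "reads per thousand reads that map to miRNAs", not the website's actual code; the function name, the miRNA names and the counts are all made up.

```python
def reads_per_thousand(raw_counts):
    """Scale raw read counts so they sum to 1000 across all miRNA-mapped reads."""
    total = sum(raw_counts.values())
    if total == 0:
        # No miRNA-mapped reads: avoid division by zero.
        return {name: 0.0 for name in raw_counts}
    return {name: 1000.0 * count / total for name, count in raw_counts.items()}


# Hypothetical raw counts for one experiment (names and numbers are invented).
experiment = {"miR-A": 600, "miR-B": 300, "miR-C": 100}
rpt = reads_per_thousand(experiment)

for name, value in sorted(rpt.items()):
    print(name, round(value, 1))
```

Because each experiment is scaled to the same total, RPT values from experiments with different sequencing depths can be placed side by side, which is what the comparison feature relies on.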
Hi,
Can you please check the hsa-miR-3190 entry? It looks like the “old” hsa-miR-3190 should have been renamed hsa-miR-3190-3p instead of changing the sequence for MIMAT0015073.
Thanks,
Peter
I see — I think you’re right. The accession MIMAT0015073 was previously attached to the 3′ arm sequence, but is now the 5′ sequence. This actually goes back to an earlier error: in release 15, this accession pointed to a sequence called hsa-miR-3190-5p, which was on the 3′ arm! I can swap the accessions of 3p and 5p sequences back, but I’m not sure that’s less confusing. Other options are leave as is, or generate a new accession for the 5p sequence.
Can anyone tell me the total number of miRNAs reported in plants to date, and from how many species?
The browse page is the place to get an overview of this. There are currently 4053 hairpin entries from 52 plant species in the database.
Can anyone tell me the total number of mature miRNAs in the current human update? There are currently 1527 sequences (stem-loop and mature) in the browse page. I want to know only the mature miRNAs. Thank you
There are 1921 distinct human miRNAs in the database.
Would you please tell me the total number of miRNAs in human, mouse and rat?
You can see this on the browse page. Click “expand all” to see all species, or the links at the top of the page to jump to specifics.
There are four human stemloop sequences (and therefore associated mature miRNAs) that do not appear to have genome co-ordinates, and thus cannot be mapped and visualised (accession numbers listed below):
MI0017872
MI0016839
MI0016059
MI0005764
Please could I get some help / advice? Thanks!
Hi Sam,
I was wondering if I could get some advice on the above problem?
Regards.
Hi Andrew
It looks like mir-941-2 and mir-4482-2 map to identical positions in the genome as the respective -1 sequences. Those entries should therefore be merged in the next release. mir-1273e doesn’t look like a real miRNA from the available deep sequencing datasets, so is also likely to go away. mir-3155b maps just fine, to chr10:6194170-6194225[-], and should be fixed in the next release.
Sam
Hi,
Could you please check the hsa-miR-365 entry?
I remember that in release 17, both precursors generated the mature miR-365 sequence (identical to the current miR-365a-3p and miR-365b-3p), while only mir-365-2 generated miR-365* (MIMAT0009199, AGGGACUUUCAGGGGCAGCUGU). Now in release 18, the same accession number (MIMAT0009199) points to miR-365a-5p, with a different sequence and labeled as “Previous ID: miR-365*”. And miR-365b-5p actually has the same sequence as the previous miR-365*.
Thanks,
Ji
Hi,
I was going through mirbase V18 database_files for Human miRNAs. I downloaded 3 files i.e. mirna.txt, mirna_mature.txt and mirna_pre_mature.txt. I found 1527 hsa precursors in mirna.txt file, 2154 hsa mature in mirna_mature.txt while I got 2110 unique hsa mature mirna from mirna_pre_mature.txt.
Can anyone explain why each human mature mirna is not linked to corresponding precursor?
The mirna_pre_mature file does link each mature sequence to its hairpin. However, the links are not 1:1: each hairpin may contain 2 mature sequences, and each mature sequence could be present in >1 hairpin. The mirna_mature file is redundant for boring reasons to do with running the website (and my general stupidity), so some miRNAs have more than one entry in that file.
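To make the many-to-many relationship concrete, here is a small Python sketch. It does not parse the real database dump (the column layout of mirna_pre_mature.txt is not assumed here); the hairpin and mature identifiers are hypothetical placeholders standing in for link pairs you would extract yourself.

```python
from collections import defaultdict

# Hypothetical (hairpin, mature) link pairs, for illustration only.
links = [
    ("hairpin-1", "mature-A-5p"),
    ("hairpin-1", "mature-A-3p"),  # one hairpin, two mature products
    ("hairpin-2", "mature-B-5p"),
    ("hairpin-3", "mature-B-5p"),  # one mature sequence, two hairpins
]


def index_links(pairs):
    """Build lookups in both directions from (hairpin, mature) pairs."""
    matures_by_hairpin = defaultdict(set)
    hairpins_by_mature = defaultdict(set)
    for hairpin, mature in pairs:
        matures_by_hairpin[hairpin].add(mature)
        hairpins_by_mature[mature].add(hairpin)
    return matures_by_hairpin, hairpins_by_mature


by_hairpin, by_mature = index_links(links)

# Mature sequences that are linked to more than one hairpin.
shared = sorted(m for m, hps in by_mature.items() if len(hps) > 1)
print(len(by_hairpin), "hairpins;", len(by_mature), "distinct matures; shared:", shared)
```

Counting distinct keys in each direction, rather than counting rows, is one way to see why the number of link rows, hairpins and mature sequences need not agree.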
Hi, may I ask whether a new release (v19) is imminent? I noticed that v17 was released on Apr 26 last year. I am designing a full-spectrum hsa miRNA microarray, so it would be nice to know if a new version is coming this month… a simple answer will save a lot of time and work. Many thanks in advance!
We’re provisionally planning a release towards the end of May. The timing may slip a little. We’re also working towards a way to publish and stick to a release schedule — I’ve promised this to various people over the years, and we’ll get there eventually!
Hi,
In hsa.gff annotation file for human, there are 1523 miRNAs in total, both sense and antisense included, right? But how do you discriminate miR/miR* in nomenclature? Because there is no -5p/-3p nomenclature in this .gff file.
Hi.
On the miRBase ftp site, in the genomes folder, we added gff3 files containing both precursor and mature sequences. I hope this answers your question.
Is there an NCBI36 version of the gff coordinates for the miRBase 18 human miRNA precursors?
I’m afraid not — we try to map everything to the latest version of assemblies. The liftover tool at UCSC provides a method to transfer genome coordinates between assemblies:
http://genome.ucsc.edu/cgi-bin/hgLiftOver
Galaxy provides methods to convert between file formats, including GFF and BED — see the “convert formats” tools here:
http://main.g2.bx.psu.edu/
I want to download all hsa miRNA sequences at once in notepad format. How can I do that? Please help me.
Notepad isn’t a format, so I guess you mean plain text format. All miRNA sequences are available in plain text format on the FTP site, zipped here:
ftp://mirbase.org/pub/mirbase/CURRENT/mature.fa.zip
ftp://mirbase.org/pub/mirbase/CURRENT/hairpin.fa.zip
You can also use the browse page:
http://www.mirbase.org/cgi-bin/browse.pl?org=hsa
Click “Homo sapiens”, then click “select all” at the bottom of the page, then “fetch”.
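If you take the FTP route, the human entries can be pulled out of the unzipped FASTA file with a short script. The sketch below is a minimal example that assumes each header line starts with ">" followed by a name carrying the species prefix (for example "hsa-"); the sample records and sequences are made up, and the helper names are hypothetical.

```python
import io


def fasta_records(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def select_species(handle, prefix="hsa-"):
    """Keep only records whose header starts with the given species prefix."""
    return [(h, s) for h, s in fasta_records(handle) if h.startswith(prefix)]


# Made-up sample standing in for the contents of an unzipped FASTA file.
sample = io.StringIO(
    ">hsa-example-1 made-up record\nUGAGGUAGUAGG\n"
    ">mmu-example-1 made-up record\nACUGACUGACUG\n"
    ">hsa-example-2 made-up record\nGGAUCC\nAAUU\n"
)
human = select_species(sample)
for header, sequence in human:
    print(header.split()[0], sequence)
```

To run it on a real download, pass an open file handle for the unzipped file in place of `sample`; the local file name depends on how you unzip it.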