
miRBase 19 released

miRBase 19 is now available, brought to you from the Benasque RNA meeting in the sunny Pyrenees, and with a slightly larger time gap than usual. In that extended time, we have added more than the usual number of new sequences — 3171 new hairpins and 3625 novel mature products, bringing the totals to 21264 and 25141 respectively in 193 species. As always, the full README file is available on the FTP site, along with downloadable files containing all data in various formats.

We have spent some time deleting misannotated sequences, and the deep sequencing read views will allow us to focus more on this: 133 entries are removed in this release, many from the rice miRNA complement. We have also cleaned up a number of cases of duplicate entries mapping to a single genomic locus (some prompted by new genome assembly releases) and rationalised many miRNA names. This is therefore a good time to remind you that the names are meant to be useful, but are not formally stable, and shouldn’t be used to convey complex information. The miRNA accession numbers *do* remain stable between releases, and of course, you can always quote the sequence to be truly unambiguous.

In this release, the miR* nomenclature is finally retired for all species, as previously promised. For every hairpin and mature sequence, all IDs that have previously been used in miRBase are now visible on the entry pages, and are downloadable in bulk from the FTP site.

At the time of writing, we have not added new deep sequencing datasets to the read view pages. However, a decent-sized update to that section will be coming along shortly, together with an announcement here.

As always, comments, questions, abuse and praise are all welcome here or by email.

Posted in data update, releases.

18 Responses


  1. Nyabi Omar says

    Very good job

  2. strelok says

    A possible bug in “mature.fa”?

    After downloading this file from the server, I found that the following two entries appear to contain an error:

    >hsa-miR-642a-5p MIMAT0003312 MI0003657 Homo miR-642a-5p
    >hsa-miR-642a-3p MIMAT0020924 MI0003657 Homo miR-642a-3p

    Compared to other human miRNA entries in the same file, the 3rd column should be “Homo”, and the 4th column should be “sapiens”.

    Please look into this problem, thank you!

    • ana says

      Thanks for your comment. Yes, you are right; we will fix this shortly.


      • sam says

        Thanks for this. Dumb typos fixed in FTP site files now.
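For anyone who wants to scan their own copy of the file for the same kind of slip, here is a minimal sketch. It assumes the header layout is `>ID ACCESSION Genus species NAME`, which is inferred only from the entries quoted above; the function name `suspicious_headers` is made up for illustration.

```python
import re

# Assumed header layout (inferred from the quoted entries, not from a spec):
#   >ID ACCESSION Genus species NAME
MIMAT = re.compile(r"^MIMAT\d+$")   # mature sequence accession
HAIRPIN = re.compile(r"^MI\d+$")    # hairpin accession, unexpected in a header


def suspicious_headers(lines):
    """Yield FASTA header lines whose fields do not fit the assumed layout."""
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, nothing to check
        fields = line[1:].split()
        ok = (
            len(fields) >= 5
            and MIMAT.match(fields[1])
            and not HAIRPIN.match(fields[2])
        )
        if not ok:
            yield line.rstrip("\n")


# Usage (file name as on the FTP site; adjust the path to your download):
# with open("mature.fa") as handle:
#     for header in suspicious_headers(handle):
#         print(header)
```

A header is flagged when it has fewer than five fields, when the second field is not a MIMAT accession, or when a hairpin (MI) accession sits where the genus name is expected, which is the pattern in the two entries reported above.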

  3. Majid says

    Thanks for this great job. How can I easily find the number of mature human miRs in the new version?

  4. sam says

    Browse page is easiest:
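If you already have mature.fa from the FTP site, a rough count can also be scripted. This is only a sketch: it assumes that every human mature sequence has an ID starting with `hsa-`, and the function name `count_mature` is hypothetical.

```python
def count_mature(lines, prefix="hsa-"):
    """Count FASTA records whose ID starts with the given organism prefix."""
    return sum(1 for line in lines if line.startswith(">" + prefix))


# Usage (adjust the path to wherever you saved the file):
# with open("mature.fa") as handle:
#     print(count_mature(handle))
```

Passing a different prefix, such as `"mmu-"`, counts the records for another organism code in the same way.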

  5. Jen says


    Great work, such a big job! I was cruising around some of the new miRNAs in mouse and saw that some had sequence reads associated on the main page but no details in the link (e.g. mmu-let-7j). Is this still being updated?

    • ana says

      Hi Jen.
      There are no reads mapping with 0 mismatches to the mmu-let-7j precursor sequence. If you go down the page and select 1 mismatch with untemplated ends, you will see all the reads that map to that hairpin.

      Hope this helps,

  6. Yuanji says

    It is useful to partition the data into different lineages: animals, plants and viruses. From the name format I can tell by pattern matching whether a miRNA is from a plant or a non-plant. How can I separate the animal miRNAs from the viral ones? Can anyone help? Thanks.

    • sam says

      There’s a list of organisms on the FTP site:

      … with the format:

      hsa	HSA	Homo sapiens 		Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Primates;Hominidae;
      ath	ATH	Arabidopsis thaliana	Viridiplantae;Magnoliophyta;eudicotyledons;Brassicaceae;
      hcmv	VRL	Human cytomegalovirus	Viruses;

      This should allow you to parse out the organism codes for any subset of species.
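As a rough sketch of that parsing step, the snippet below groups organism codes by the first element of the taxonomy column. It assumes tab-separated fields in the order shown above (code, division, name, taxonomy) and skips any line starting with `#` on the assumption that such lines are comments; `codes_by_kingdom` is a made-up name.

```python
from collections import defaultdict


def codes_by_kingdom(lines):
    """Group organism codes by the first element of the taxonomy column."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line or (assumed) comment/header line
        # Split on tabs and drop empty fields, since runs of tabs can occur.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) < 4:
            continue  # not enough columns to classify
        code, _division, _name, tree = fields[:4]
        groups[tree.split(";")[0]].append(code)
    return dict(groups)


# Usage (hypothetical local file name for the organism list):
# with open("organisms.txt") as handle:
#     groups = codes_by_kingdom(handle)
#     animal_codes = set(groups.get("Metazoa", []))
#     virus_codes = set(groups.get("Viruses", []))
```

With the three example lines above, the codes end up under `Metazoa`, `Viridiplantae` and `Viruses` respectively. The prefix of a miRNA name before the first hyphen can then be looked up in the relevant set.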

  7. Yi-nan says

    How can I tell which miRNAs are more reliable, for example those validated by RT-qPCR, and which have only RNA-Seq support?

    • sam says

      Each mature miRNA in the database has a line that says what its evidence is. In the miRNA.dat file on the FTP site, it looks like this:

      FT miRNA 11..32
      FT /accession="MIMAT0004513"
      FT /product="hsa-miR-101-5p"
      FT /evidence=experimental
      FT /experiment="cloned [3-4]"

      So that says that hsa-miR-101-5p was identified by small RNA cloning in refs 3 and 4 of that entry. You can see this on the website here:

      These evidence codes are not entirely comprehensive — by that I mean that we are not systematically scouring the literature to fill these in; rather these are the papers whose authors have submitted the data to us. That means that every entry is at least tagged with the method from its first identification.
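To pull these evidence lines out in bulk, something like the following sketch may help. It assumes only the layout shown in the excerpt above: an `FT miRNA <location>` line followed by `FT /key=value` qualifier lines, with the feature ending at the next non-FT line. The name `mature_features` is hypothetical, and the real file may contain cases this does not handle.

```python
import re

FEATURE_START = re.compile(r"^FT\s+miRNA\s")
QUALIFIER = re.compile(r"^FT\s+/(\w+)=(.*)$")


def mature_features(lines):
    """Yield one dict of qualifiers per 'FT miRNA' feature."""
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if FEATURE_START.match(line):
            if current is not None:
                yield current  # previous feature is complete
            current = {"location": line.split()[-1]}
            continue
        match = QUALIFIER.match(line)
        if match and current is not None:
            # Store the value without surrounding quotes.
            current[match.group(1)] = match.group(2).strip().strip('"')
        elif not line.startswith("FT") and current is not None:
            yield current  # a non-FT line closes the feature block
            current = None
    if current is not None:
        yield current


# Usage: list products whose evidence is marked experimental.
# with open("miRNA.dat") as handle:
#     for feat in mature_features(handle):
#         if feat.get("evidence") == "experimental":
#             print(feat.get("product"), feat.get("experiment"))
```

Each yielded dict carries the keys seen in the excerpt (`accession`, `product`, `evidence`, `experiment`) plus the `location`, so filtering on the `experiment` text is a simple string test.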

  8. Laura Klitten says

    Hello – thanks for a great database. Can you elaborate a bit more on the reasoning for keeping the same accession numbers (MIMAT numbers) when a new miRBase version is released? For practical purposes it is not convenient to refer to the sequence. Would you consider adding an extension to the MIMAT number that refers to the miRBase release?

    • sam says

      Hi Laura

      The accession numbers are the only truly stable entity between releases, so it’s important that they remain the same. I think what you’re referring to is our lack of a sequence versioning system, which is on our list to fix ASAP. The norm is to use accession.version as a unique identifier for a sequence, e.g. MIMAT0020301.3. At present you can refer uniquely to a sequence version with the accession and the miRBase release number, e.g. MIMAT0020301 (v19), or similar.


      • Laura Klitten says

        Hi Sam
        Thanks a lot for your positive answer!
        Is it also possible to see the reference sequences from the older versions of miRBase? It would be great if I could just click on my miR of interest and then get an overview of the change in reference sequence over time.

        • sam says

          You can’t currently see this through the website, but you can download all previous versions of the database from the FTP site: Along with formal versioning, a better view of previous sequence versions for the website is on our list already.
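In the meantime, two downloaded releases can be compared by accession with a short script. This is a minimal sketch, assuming each release provides a mature.fa-style FASTA file in which the accession is the second field of the header; the function names and file paths are made up for illustration.

```python
def read_by_accession(lines):
    """Map accession (second header field) to sequence for a FASTA stream."""
    seqs, acc = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            fields = line[1:].split()
            acc = fields[1] if len(fields) > 1 else None
            if acc is not None:
                seqs[acc] = ""
        elif acc is not None and line:
            seqs[acc] += line  # sequences may span several lines
    return seqs


def changed_sequences(old_lines, new_lines):
    """Return {accession: (old_seq, new_seq)} where the sequence differs."""
    old = read_by_accession(old_lines)
    new = read_by_accession(new_lines)
    shared = old.keys() & new.keys()
    return {a: (old[a], new[a]) for a in shared if old[a] != new[a]}


# Usage (hypothetical local paths for two downloaded releases):
# with open("release_18/mature.fa") as old, open("release_19/mature.fa") as new:
#     for acc, (before, after) in sorted(changed_sequences(old, new).items()):
#         print(acc, before, "->", after)
```

Keying on the accession rather than the name matters here, because names can change between releases while accessions stay stable.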

  9. zaynab says

    Hi, thank you for everything. I have a question: if I want to take a sequence, which one should I take, the stem-loop or the mature one?

    • sam says

      Sorry — I don’t understand. Whether you want the mature or stem-loop sequence depends on what you are trying to do. For example, the stem-loop sequence is better for homology searches, structure prediction etc, whereas you want the mature sequence if you’re predicting targets.
