miRBase 20 is coming

miRBase 20 is long overdue, but should finally make an appearance within the next week. As you might expect, the extended period since the last release means many new entries — over 3000 new stem-loop sequences, and over 5000 new mature sequences. These additions mostly expand the miRNA sets of species already in the database, rather than adding new species. More soon.

Posted in Uncategorized.

Website at risk, Tues 19th March 8am-9am GMT

The miRBase website may be intermittently inaccessible from 8am-9am GMT on Tuesday 19th March, and all day on Saturday 23rd March, while some network and electrical maintenance is carried out. Apologies for any inconvenience.

Posted in Uncategorized.

miRBase web site down time, Oct 22nd-23rd

Essential network and electrical work in our server room means that the web site is at risk of intermittent down time on Monday 22nd and Tuesday 23rd October. Apologies for any inconvenience.

Posted in down time.

miRBase 19 released

miRBase 19 is now available, brought to you from the Benasque RNA meeting in the sunny Pyrenees, and with a slightly larger time gap than usual. In that extended time, we have added more than the usual number of new sequences — 3171 new hairpins and 3625 novel mature products, bringing the totals to 21264 and 25141 respectively in 193 species. As always, the full README file is available on the FTP site, along with downloadable files containing all data in various formats.

We have spent some time deleting misannotated sequences, and the deep sequencing read views will allow us to focus more on this: 133 entries are removed in this release, many from the rice miRNA complement. We have also cleaned up a number of cases of duplicate entries mapping to a single genomic locus (some prompted by new genome assembly releases) and rationalised many miRNA names. This is therefore a good time to remind you that the names are meant to be useful, but are not formally stable, and shouldn’t be used to convey complex information. The miRNA accession numbers *do* remain stable between releases, and of course, you can always quote the sequence to be truly unambiguous.

In this release, the miR* nomenclature is finally retired for all species, as previously promised. For every hairpin and mature sequence, all IDs that have previously been used in miRBase are now visible on the entry pages, and are downloadable in bulk from the FTP site.

At the time of writing, we have not added new deep sequencing datasets to the read view pages — however, a decent sized update to that section will be coming along shortly, together with an announcement here.

As always, comments, questions, abuse, praise all welcome here or by email.

Posted in data update, releases.

miRBase 19 is coming …

We’re scrambling to release miRBase 19 in the next few days from the Benasque RNA meeting in the middle of the sunny Pyrenees. We have over 3000 new sequences, including the first entries for 25 new species (mostly plants). We’ve also put some effort into cleaning up some old entries, deleting over 130 misannotated sequences. More soon.

Posted in releases.

Missing comments

A corrupt database has led to the loss of any blog comments left in the past 10 days or so. Please feel free to email or re-post. Apologies for any inconvenience.

Posted in Uncategorized.

miRBase, Wikipedia and community annotation

Many miRBase entry pages have a new “community annotation” section (see, for example, dme-mir-10). This section incorporates information about specific microRNA families and sequences taken directly from the free, online encyclopedia, Wikipedia. In total, over 4500 miRBase entries currently include information from Wikipedia. We show the summary paragraph from the Wikipedia page, the full page, and a link to edit the page in Wikipedia. Any edits will appear in Wikipedia immediately, and in miRBase within 24 hours.

There is already a large amount of information in Wikipedia about specific microRNA sequences and families. We hope that distributing this information in miRBase, and providing links to edit the pages, will encourage miRBase users and microRNA experts to contribute their knowledge in the form of Wikipedia edits and new pages. Textual annotation of microRNAs in miRBase is therefore now firmly in the hands of the microRNA community.

Anyone can edit a Wikipedia page, and editing a page is straightforward. However, Wikipedia has strict policies and guidelines about how to edit and create pages. Adhering to these guidelines makes it much more likely that your contributions will survive. The following help pages on the Wikipedia site provide detailed information about how to keep Wikipedians happy:

The most important thing to remember is that information you add should be substantiated, preferably with literature citations. You’ll see that lots of existing Wikipedia microRNA pages have fairly minimal information. We’re compiling a list of microRNA pages that are in need of some attention, here. Please take a look, and consider adding some information. Pages such as the mir-10 entry, and the mir-2 family page provide excellent models for what makes a great microRNA page.

You can also create new pages at Wikipedia about microRNA sequences and families that have miRBase entries, but don’t currently have Wikipedia entries. Please let us know if you do this, so we can incorporate your annotation into miRBase, and create the appropriate links from miRBase entries to the relevant Wikipedia pages. The most important thing to remember if you’re considering making a new Wikipedia page about a microRNA is that your contribution should be “notable”. A microRNA of completely unknown function is unlikely to be worthy of a Wikipedia page. However, if you’ve just published a paper that describes the evolution of the mir-277646 family, and its function as a core regulator of the cell cycle, then a Wikipedia page is certainly deserved.

Let us know what you think, here or by email to the usual address.

Now go edit!

This effort is building on that of the Rfam database of RNA families, which has paved the way in incorporating RNA information (and biological annotation more generally) into Wikipedia, led by Alex Bateman with all the real work done by Jen Daub, John Tate and Paul Gardner. We are extremely grateful to them for allowing us to steal code and lists of relevant Wikipedia pages.

The following sources provide detailed information about the Rfam/Wikipedia alliance, and its success:

Daub J, Gardner PP, Tate J, Ramsköld D, Manske M, Scott WG, Weinberg Z, Griffiths-Jones S, Bateman A. The RNA WikiProject: community annotation of RNA families. RNA. 2008 14(12):2462-2464.

Logan DW, Sandal M, Gardner PP, Manske M, Bateman A. Ten simple rules for editing Wikipedia. PLoS Comput Biol. 2010 6(9):e1000941.

Bateman A, Logan DW. Time to underpin Wikipedia wisdom. Nature. 2010 468(7325):765.

Posted in community annotation, new features.

MicroRNA Wikipedia pages in need of attention

The following Wikipedia pages about microRNA sequences and families could do with some loving care. Please take a look, and consider adding information about microRNA function, evolution, discovery, and references. Feel free to comment here, or email us at the usual address, if you make changes worthy of removing pages from this list.

mir-92_microRNA_precursor_family (Intro section is out of date and needs a rewrite)

Posted in community annotation, new features.

miRBase website “at risk”, Thu 10th to Fri 18th Nov

Due to server room refurbishment, the miRBase website may experience some instability between Thu 10th and Fri 18th November 2011. The plan is for just 30 minutes or so down time at either end of that period, but the website should be considered “at risk” throughout. Apologies for any inconvenience.

Posted in down time.

miRBase 18 released

After a little more pain than usual, miRBase 18 is finally released. The database contains 18226 entries representing hairpin precursor miRNAs, expressing 21643 mature miRNA products, in 168 species. That represents 1488 new hairpin sequences and 1929 novel mature products. The full README file is available on the FTP site.

As previously discussed, we have continued to rename mature sequences, phasing out the miR/miR* nomenclature in favour of the -5p/-3p nomenclature. That affects approximately 1400 mature sequences this time, from human, mouse and C. elegans. (We had planned to do rat as well, but decided to hold off until we had incorporated more rat deep sequencing data.)

There are also significant changes to the zebrafish miRNA complement, rationalising the entries with respect to the (now not so new) Zv9 genome assembly. That has led to the deletion of 26 zebrafish entries, and the creation of 12 entries that represent duplicate loci. The full list of changes is itemized in the miRNA.diff file on the FTP site.

The website also shares new deep sequencing data — now approaching 250 datasets from NCBI GEO. In addition to raw read counts, we also show normalized read counts, currently calculated as reads per thousand reads that map to miRNAs (designated RPT on the website). We have also implemented a new feature to allow the comparison of normalized read counts from multiple experiments. For example, from the list of all D. melanogaster datasets (accessible from the “by tissue expression” box on the search page), you can tick up to 5 different experiments to compare read counts. This is getting dangerously close to allowing some really complex and powerful analyses through the website! You can also download the read counts from the results page, for offline processing. This is all Ana Kozomara’s work. As with all new features, it is wise to consider this to be in beta. We’ll be very happy to get your comments, bugs, praise, and abuse, as usual, here or by email.
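For anyone processing downloaded read counts offline, the RPT normalization described above can be sketched in a few lines of Python. This is only an illustration of the stated definition (reads per thousand reads that map to miRNAs), not the code behind the website. The function name `reads_per_thousand` and the example counts are made up for the sketch, and it assumes the denominator is the total number of miRNA-mapped reads in a single dataset.

```python
def reads_per_thousand(raw_counts):
    """Normalize raw read counts to reads per thousand miRNA-mapped reads (RPT).

    raw_counts maps a mature miRNA name to its raw read count in ONE dataset.
    Assumption: the denominator is the sum of all miRNA-mapped reads in that dataset.
    """
    total = sum(raw_counts.values())
    if total == 0:
        # No miRNA-mapped reads at all: avoid dividing by zero.
        return {name: 0.0 for name in raw_counts}
    return {name: 1000.0 * count / total for name, count in raw_counts.items()}


# Hypothetical counts, invented purely for illustration.
example_counts = {
    "dme-miR-10-5p": 500,
    "dme-miR-10-3p": 100,
    "dme-bantam-3p": 1400,
}

rpt = reads_per_thousand(example_counts)
for name, value in sorted(rpt.items()):
    print(f"{name}\t{value:.1f}")
```

Because every dataset is scaled to the same total of 1000, values computed this way can be compared across experiments with very different sequencing depths, which is the point of the comparison feature.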

Posted in data update, new features, releases.