Forum: Should bioinformaticians learn the scientific names of plants and animals?
18 months ago

When dealing with DNA sequences, one often comes across the scientific names of plants or animals, for instance well-known names like felic catus or utterly unfamiliar ones like Enterocytozoon hepatopenaei. Should I learn these names, or at least the basics, so as not to get confused? Or is this just redundant information that is of little importance to a bioinformatician?

If I am understanding you correctly, you are asking if you need to learn the genus/species of every plant and animal? Good luck with that. Those just fall under:

Kingdom->Phylum->Class->Order->Family->Genus->Species

This is how different organisms are grouped in science, and it is called taxonomy. When referring to a specific organism (such as the ones listed above), usually just the genus/species is given as the identifier. If you are working with a particular organism or group of organisms of interest (such as bacteria or cats or whatever), it may be useful for you to know which genus/species you are working with. This may be particularly useful if you are using a prebuilt reference genome for a specific bacterium or plant, but other than that I don't know where you would use genus/species in analysis.

From a practical point of view, scientific names are the only sure way of unambiguously identifying an organism because vernacular names are too variable. Native speakers of a given language do not necessarily agree on a common name for a given organism.
If you're not a native speaker of English, you'll have to learn the English names for species you work with anyway. For example, as a French speaker, I knew what a danio (name of the zebrafish in French) was but not what a zebrafish was until I saw the scientific name of the zebrafish.

They should know these on a 'need to know' basis.

Since this is more asking for an opinion than a question which has a definite answer I have converted this to a "Forum" post.

4 weeks ago
genomax 68k
United States

While the information is critical, it is just a Google search away (if you need to confirm something).

Or, if you're on a Mac, press Cmd+Space, then type <common_name> scientific name.

17 months ago
Washington University in St. Louis, MO

Almost certainly no. Beyond "Homo sapiens" and "Mus musculus", there are probably only two or three I'd recognize. You'll pick up domain-specific knowledge like that by diffusion from whatever project you're on, but a calculated study of them is probably not a good use of your time.

17 months ago
h.mon 25k
Brazil

You probably should know the names of the species you are currently working with, and maybe their closest relatives. More important than that, you should know relevant information about your species, like genome size, heterozygosity, repeat content, ploidy, and so on (not that all this information is available in the literature, anyway). This information will help you devise the best analytical strategies, and may prove important for troubleshooting when things don't work down the road.

15 months ago
Freiburg, Germany

Is it relevant to your job to immediately recognize some large number of these? If not, then the answer is "no". Most of us who studied biology don't know more than a handful of species names; that sort of thing is rarely useful for anyone to know. As a general rule of life, random facts like that aren't useful to memorize unless you want to take part in Jeopardy.

16 months ago
Carambakaracho ♦ 1.2k
Switzerland/Basel

One additional opinion: when dealing with different unicellular species or even metagenomic samples, there's often not much more than the scientific name for a species, starting with well-known examples like Escherichia coli or Bacillus subtilis, two fairly common bacteria without trivial names, afaik.

However, you will learn the ones you need as you go. After my studies I didn't know more than the others mentioned above. Today, due to the position I work in, I know dozens.

11 months ago
India

I think it is good to know the scientific names of model organisms or highly studied plants and animals, for example:

Plants

• rice (Oryza)
• maize (Zea mays)
• wheat (Triticum)

Animals

• zebrafish (Danio rerio)
• fruit fly (Drosophila)

Remembering those will do no harm. These are a handful of the names you will hear or read about frequently in scientific forums and at conferences and seminars.

Apart from that, as suggested by fellow Biostars users, there is no need to remember all of them (it is impossible!). Just google them on a case-by-case basis.

13 months ago
i.sudbery 4.7k
Sheffield, UK

To add to what others have said, knowing a lot of these is probably not useful, but being able to recognise the common ones will probably be helpful (know what they are when you see them, but not necessarily be able to write them yourself). I doubt anyone "learnt" these deliberately; they just become ingrained over time, and everyone's list is different, but my list would be:

Homo sapiens (human)
Mus musculus (house mouse)
Drosophila melanogaster (fruit fly)
Danio rerio (zebrafish)
Caenorhabditis elegans (roundworm)
Arabidopsis thaliana (thale cress)
Saccharomyces cerevisiae (baker's/brewer's yeast)
Escherichia coli (the common model bacterium)

16 months ago
UK, Hinxton, EMBL-EBI

Whether you want to memorise the names or not is a matter of choice and of the time and effort you are willing to dedicate.

However, we should comply with the standards and recommendations on taxonomy, which in my view does share qualities with any ontology (i.e. hierarchy, standards).

Latin names for species should be in italics.

The genus component of the name should start with an upper-case letter, e.g. Felis catus is OK, felis catus is not OK.

The species component of the name should be in lower case, e.g. Mus musculus is OK, Mus Musculus is not OK.

Depending on the context, these rules could (would) be lifted. If you write code, for example, you will not write those in italics.
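
As a rough illustration of the capitalisation rules above, here is a minimal Python sketch that normalises a name before it is used as a lookup key. The function name `normalize_binomial` is hypothetical, and the sketch assumes the input is a plain "genus species" string, optionally followed by extra tokens (such as a strain label) that are left untouched.

```python
def normalize_binomial(name: str) -> str:
    """Return the name with a capitalised genus and a lower-case species.

    Hypothetical helper for illustration: it only fixes letter case and
    does not check that the name refers to a real organism.
    """
    parts = name.split()
    if len(parts) < 2:
        raise ValueError(f"expected at least a genus and a species, got {name!r}")
    genus, species, *rest = parts
    # str.capitalize() upper-cases the first letter and lower-cases the rest.
    return " ".join([genus.capitalize(), species.lower(), *rest])


print(normalize_binomial("felis catus"))
print(normalize_binomial("Mus Musculus"))
```

Note that this does nothing for misspellings: a typo in the genus still has to be caught by checking against a curated list or a taxonomy database.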

I don't think there is such a thing as redundant information. The more you work with felic catus, the more you will know it is Felis catus, and the more you will know it means cat.

It's important to get the name right. It's UniProt, not Uniprot or uniprot. It may sound pedantic. It may be pedantic. But we need to use the right name. The wrong piece of code may make a difference, give you an error or give you the wrong output. So details are important.

I look at information as relevant information. Some is very relevant information, others not so much so.

Note, we say mm10 for the genome assembly for mouse, because mm = Mus musculus.
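
To make the link between assembly names and species concrete, here is a small Python sketch that maps a UCSC-style assembly name back to a scientific name. The table is a hand-picked illustration rather than a complete list, and the function name `species_for_assembly` and the "letters followed by digits" pattern are assumptions made for this example.

```python
import re

# Hand-picked examples of UCSC-style assembly prefixes (not exhaustive).
ASSEMBLY_PREFIX_TO_SPECIES = {
    "mm": "Mus musculus",
    "hg": "Homo sapiens",  # this prefix is not built from the Latin name
    "dm": "Drosophila melanogaster",
    "ce": "Caenorhabditis elegans",
    "danRer": "Danio rerio",
    "sacCer": "Saccharomyces cerevisiae",
}


def species_for_assembly(assembly: str) -> str:
    """Look up the species for a name such as 'mm10' (hypothetical helper)."""
    # Assumption: the name is a run of letters followed by a version number.
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", assembly)
    if match is None:
        raise ValueError(f"unrecognised assembly name: {assembly!r}")
    prefix = match.group(1)
    if prefix not in ASSEMBLY_PREFIX_TO_SPECIES:
        raise KeyError(f"unknown assembly prefix: {prefix!r}")
    return ASSEMBLY_PREFIX_TO_SPECIES[prefix]


print(species_for_assembly("mm10"))
```

The lookup is deliberately case-sensitive, so "MM10" is rejected rather than silently accepted, which fits the point above about getting names exactly right.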