Rules of scientific English writing for an international audience.
Although English is the common language for international scientific communication, most scientists are not native English speakers. To account for this, I think that all scientists (especially native English speakers) should try to write text which is easy to read for non-native speakers. I propose the following rules for this:
- ABSOLUTELY NO CULTURAL REFERENCES. Americans: pay extra attention to this!
- Use simple words. Repeated words are ok.
- No slang, especially current slang.
- Use literal language instead of metaphorical language.
- Use simple sentence structures. In particular, minimize nested clauses.
- Avoid Latin/French/German phrases.
- Minimize the use of idioms/fixed phrases. If you really want to use an idiom, use quotation marks.
Explanations and examples
No cultural references
More precisely, my rule is "no cultural references which will not be universally understood". In practice, however, I think very few cultural references actually are universally understood (as I will elaborate on below). Therefore, I think "no cultural references" is generally a sufficient description. If you really want to include a cultural reference, at least include a footnote which explains it.
Avoiding cultural references is hard
When I hear "avoid cultural references", I think of references to a specific book/film/TV series that not everybody has read/watched. For example, if I were to describe a procedure with excessively strict rules as "soup-Nazi-esque", readers who have not watched a particular episode of the TV series Seinfeld would not understand it. This is easy to imagine and therefore easy to avoid (especially since I know many people my own age and with a similar background who have not watched this episode of Seinfeld).
This kind of reference is atypically easy to avoid. References which may be less obvious are:
- "Famous" people: who is famous varies enormously between places and between generations. People of my parents' age may not know about Billie Eilish, and young people might not know about The Rolling Stones. Even "tier 1" American celebrities like Tom Cruise or Taylor Swift might not be known to, for example, a 60-year-old Chinese scientist.
- Allusions to "classics": it may feel safe to assume that people are vaguely familiar with "classics", at least their names/themes. Unfortunately, what is considered a "classic" varies widely by region. People in non-Western countries may not have watched Star Wars or old Disney films. Non-anglophones may not have heard of Charles Dickens. Characters from Greek mythology (eg Zeus) are probably not well known outside Europe. Even characters from the Bible may not be universally known (eg in largely atheist countries like China).
- Historical figures: similar to classics, not everybody will know about George Washington, Queen Victoria, Augustus Caesar, or Hammurabi.
- Cuisine: phrases like "the meat of the argument" reflect European cooking practices of a meat-centric meal (not shared in places like largely-vegetarian India).
- Stores/brands: these vary a lot in different places! Don't assume readers will know "Rolex is a luxury watch" or "Ferrari is a fast, expensive, and high-quality car".
Americans: be extra careful
In my experience, Americans have a particularly strong tendency to overestimate foreigners' familiarity with American culture. This is somewhat understandable because American culture is globally dominant and many foreign scientists are familiar with many popular items from American culture such as McDonald's, rap, jazz, the TV series Friends, etc. However, this familiarity should not be extrapolated too far. Some examples:
- While the most popular US states might be known (eg California, Florida, Texas), less prominent states might not be (eg Montana, Idaho). Don't assume that people will know where these states are (even roughly) or their stereotypes (eg Nebraska has lots of farms).
- Most of the world does not learn US history (including the Civil War).
- People may not have heard of smaller US cities like St Louis or Denver.
If you are American and A) don't have family who live abroad and B) have not lived abroad yourself, you should probably assume that you cannot judge what foreigners will and will not know. Therefore, I suggest you default to assuming nothing.
Is there really nothing universal?
Some things are probably safe. For example, McDonald's has restaurants in more than 100 countries. Probably everybody has heard of Obama, the Queen of England (Elizabeth II), and Coca-Cola. But even when suggesting these I have lingering doubts.
Exception: the culture of science itself
At least when writing for an audience of scientists, I propose that the following references are acceptable:
- Famous scientists like Einstein and Newton. Most likely all scientists in all fields will know at least these two names.
- Cultural trends within science like "open data" and "the reproducibility crisis"
- Events like the development of the atomic bomb, the discovery of the structure of DNA, etc.
Heuristic basket of countries
If you are trying to assess whether a specific cultural reference will be understood by an international audience, I suggest explicitly considering the following 5 countries:
- USA
- Germany (representative European country)
- India (huge country with unique culture)
- China (also huge country with a unique culture, and also a lot of censorship of Western books/films)
- Japan (Westernized rich country with very non-Western history and culture)
This list is, of course, not representative of the world population (eg lack of any African, Middle Eastern, or South American countries), but this lack of representation is (unfortunately) also reflected in the population of world scientists (at least in my experience). At the very least, the inclusion of 3 non-Western countries will help avoid extremely Euro-centric references.
Simple words
Non-native speakers tend to have a smaller vocabulary. Using simpler words means that readers do not need to interrupt their reading to consult a dictionary. Some examples of simplified words:
- Multitude/numerous/plethora/innumerable/countless => many/a lot
- Utilize/employ/make use of/operate => use
- Obtain/acquire/procure/secure => get
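A substitution list like this can also be checked automatically. The following Python code is only a minimal sketch under my own assumptions: the names `SIMPLER` and `suggest_simpler_words` are hypothetical, the word list contains only some of the examples above, and the code does not handle inflected forms such as "utilized".

```python
import re

# Hypothetical mapping built from some of the examples above.
# "secure" and "operate" are left out because they often have other meanings.
SIMPLER = {
    "multitude": "many",
    "numerous": "many",
    "plethora": "many",
    "innumerable": "many",
    "countless": "many",
    "utilize": "use",
    "employ": "use",
    "obtain": "get",
    "acquire": "get",
    "procure": "get",
}


def suggest_simpler_words(text):
    """Return (word, suggestion) pairs for each listed word found in text."""
    suggestions = []
    for word in re.findall(r"[A-Za-z]+", text):
        simpler = SIMPLER.get(word.lower())
        if simpler is not None:
            suggestions.append((word, simpler))
    return suggestions


print(suggest_simpler_words("We utilize numerous samples."))
# prints [('utilize', 'use'), ('numerous', 'many')]
```

The function only reports suggestions and does not rewrite the text, because some replacements need a change in grammar (for example "a multitude of stars" becomes "many stars").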
Potential objection 1: simplifying words loses meaning. For example, saying "there are many stars in the sky" instead of "there are innumerable stars in the sky" removes the implication that the number of stars is so large that one cannot intuitively reason about the number. A reader could potentially interpret "many" stars as 100, while "innumerable" would not be interpreted that way. My counter-argument: usually these subtle distinctions are not important, and if they are important then they should be stated explicitly. For example, one could say "there are many stars in the sky, at least one trillion."
Potential objection 2: I was explicitly taught in high-school English class that using the same word too many times makes text hard to read. For example, "the sky contains many stars, many black holes, many pulsars, and many galaxies" could be changed to "the sky contains many stars, a multitude of black holes, countless pulsars and innumerable galaxies." My counter-argument: the goal of scientific writing is clear communication, not enjoyment. Although I personally enjoy reading the second version of the sentence more, I am a native English speaker. I probably would have enjoyed the sentence much less if I needed to open a dictionary to understand it. Therefore, I think this high-school English rule should explicitly be ignored in scientific writing.
No slang
Most people try to avoid slang already, but it is easy to accidentally include slang words that you use in everyday speech, eg "cool" or "rough" (like "rough time"). Always double check your text.
Literal language over metaphorical language
Metaphorical language is very common in English, and is often considered a key part of "educated" writing. However, readers who are unfamiliar with a particular metaphor are likely to interpret it literally before realizing that it is a metaphor. Therefore, I suggest using literal language instead of metaphors whenever possible.
Here are some examples:
- "Flowery language": unless the text is actually about flowers, this is a metaphor. I suggest "fancy language".
- "Warm greeting": unless the greeting had a high temperature, "friendly greeting" is more literal.
- "Hot topic": an abstract concept such as a topic does not have a temperature. "Popular topic" is more literal.
Potential objection: these uses of words are all legitimate. The metaphorical meanings of "flowery", "warm", and "hot" all appear in the dictionary. My counter-argument 1: while they do appear in the dictionary, they are not the first entry. For example, the metaphorical usage of the word "hot" above was entry #4 in the online dictionary I used. It is not fair to the reader to make them read 4 dictionary entries. My counter-argument 2: these meanings appear in the dictionary because the metaphors are common, not because they are "legitimate" usages of the words.
Simple sentence structures
Use simple sentence structures which are easy to read. I think "nested clauses" in particular cause a lot of difficulties and deserve special attention. An example of a sentence with a nested clause is:
The teacher who has students that study hard will see their students succeed.
Native speakers will fairly easily parse this as:
The teacher (who has students [that study hard]) will see their students succeed.
However, non-native speakers may find this more difficult, and even native speakers may find it hard to read this sentence quickly. Instead, write something like:
The teacher will see their students succeed if their students study hard.
This avoids the nested clause.
Avoid Latin/French/German phrases
Some Latin/French/German phrases effectively function as advanced English vocabulary, for example:
- lingua franca: common language
- ad infinitum: repeated infinitely
- carte blanche: complete freedom to do what you think is best
- coup de grâce: final action which ends a bad situation
- ansatz: initial guess
Remember that the contemporary scientific community is not Europe in 1800. International scientists will be highly educated, but this education will likely not include French/German/Latin (unless they are in France or Germany). Using these words makes your text less international, not more. In almost all cases the English translations are preferable.
NOTE: there are some notable exceptions. I think the abbreviations i.e. and e.g. are common enough that most people know them. Some foreign words are part of the established nomenclature in some fields (like ansatz in physics and coup d'état in history).
Minimize idioms
Idioms (sometimes called "expressions", "set phrases", "sayings", etc) are effectively difficult words which non-native speakers may not know. However, idioms are harder for readers to understand than difficult words because:
- They may not appear in a dictionary.
- It might be difficult to know which words in a sentence are included in the idiom.
Therefore, I suggest avoiding idioms as much as possible.
If you ignore this advice and use an idiom anyway, I suggest surrounding it with quotation marks. Although this may look ugly, it clearly signals to the reader that the words within the quotations should be parsed together and can be copy/pasted into a search engine.
Native speakers beware!
This is harder than it seems, especially for native speakers. For example, when I think about avoiding idioms I imagine the phrase "it's raining cats and dogs" which has a fixed non-literal meaning (it's raining intensely) but has an obviously misleading literal meaning. While I agree that this phrase should not appear in scientific writing, I think this is not the phrase you should have in mind when you think about avoiding idioms. If a non-native speaker reads this phrase, it is obviously so absurd that they would probably search for its meaning.
Instead, I suggest imagining the phrase "once upon a time". This is a fixed phrase which most native English speakers probably know, and unlike "raining cats and dogs" it has no obvious "wrong" interpretation. However, read literally word-by-word, the phrase does not make much sense. Additionally, it might be difficult for non-native speakers to identify the fixed phrase within a larger sentence. Imagine the sentence:
Once upon a time chemists wanted to transform lead into gold.
Native speakers will parse this as:
(Once upon a time) (chemists wanted to transform lead into gold).
However, a non-native speaker may try to parse it as:
Once (upon a time chemists wanted) to transform lead into gold.
Does this mean that one time, at a time which chemists wanted, they transformed lead into gold? That doesn't make sense. Maybe:
(Once upon) (a time chemists wanted) (to transform lead into gold).
Does this mean that when a specified desired time was reached, chemists transformed lead into gold? Again, this doesn't make sense. What about:
(Once upon a time chemists) wanted to transform lead into gold.
Some kind of special chemists wanted to transform lead into gold?
Hopefully this gives you a sense of how non-native speakers may struggle to figure out what to look up.
Some other idioms to avoid
Here are some idioms which I use very naturally as a native speaker, but which are probably confusing to non-native speakers:
- "few and far between": far between what? Just write "rare".
- "time and time again"/"from time to time"
- "on the go"/"on the run"
- "fall flat": be received poorly / not be understood
- "keep in mind": I think "remember" is clearer
- "bird's eye view": overview is clearer
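A list of idioms like this one can be used to search a draft automatically. The following Python code is a minimal sketch under my own assumptions: the names `IDIOMS` and `find_idioms` are hypothetical, the list contains only the idioms above, and the search does not find inflected forms such as "fell flat".

```python
import re

# Hypothetical list containing only the idioms named above.
IDIOMS = [
    "few and far between",
    "time and time again",
    "from time to time",
    "on the go",
    "on the run",
    "fall flat",
    "keep in mind",
    "bird's eye view",
]


def find_idioms(text):
    """Return the listed idioms that occur in text, in list order."""
    lowered = text.lower()
    found = []
    for idiom in IDIOMS:
        # \b prevents "on the go" from matching inside "on the gold".
        if re.search(r"\b" + re.escape(idiom) + r"\b", lowered):
            found.append(idiom)
    return found


print(find_idioms("Keep in mind that good datasets are few and far between."))
# prints ['few and far between', 'keep in mind']
```

Each idiom that the function reports can then be replaced with literal words or surrounded with quotation marks.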
References to idioms
References to idioms do not make sense unless the reader knows the idiom. I highlight this specifically because references to idioms are fairly common, but it may not occur to native English speakers that they are making a reference.
Example: a biologist might encourage others to distribute their cells between multiple freezers to "avoid putting all your cells in one basket". This is a reference to the idiom "don't put all your eggs in one basket", which generally means "don't allow one accident (like dropping the basket) to ruin everything you have (like breaking all your eggs)". However, the recommendation does not make sense if the reader does not know this idiom. I suggest just stating the consequence literally, eg "avoid losing all your cells if one freezer loses power".
Potential objections to these rules, and my responses to these objections
- These rules encourage dull writing: it is possible to write engaging text without breaking these rules. However, the point of science is not to be exciting or entertaining to read. If you want this then go read a fiction book!
- Idioms and cultural references make text more interesting to read, I want to use them: first, remember that the interest is lost if the reader does not understand the references. Second, my rules still allow you to include them if you really want to, as long as you use quotation marks and footnotes to explain them.
- Large language models largely solve this problem: true, you can ask an LLM to explain the idioms and cultural references in a sentence. I tried ChatGPT a few times and it did a good job. However, I don't think it is kind to your readers to make them use ChatGPT to understand the text of your paper, especially when it is mostly a stylistic choice.
- We should not "dumb down" our English writing, foreign scientists should instead improve their English: this is true to a degree. For example, I would not advocate using the word "non-stop" instead of "continuous" just because "continuous" is a more difficult word. However, things like metaphors and cultural references are largely optional in scientific writing and are somewhat unrelated to language skill (in my opinion). Try to empathize with a 50-year-old Japanese scientist working in Japan. Their life is entirely in Japanese, except for reading/writing papers in English and speaking to foreigners in English at international conferences approximately once per year. How much time do you want them to dedicate to learning advanced English to understand your "literary" writing?
Conclusions
These rules all stem from my personal experience studying foreign languages (mostly Chinese and French). Basically, try to write text that is easy to read and eliminate unnecessary complexity whenever possible.