A Medley of Potpourri

Monday, August 22, 2022

Rain

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Rain

Hard rain on a roof

Rain is water droplets that have condensed from atmospheric water vapor and then fall under gravity. Rain is a major component of the water cycle and is responsible for depositing most of the fresh water on the Earth. It provides water for hydroelectric power plants, crop irrigation, and suitable conditions for many types of ecosystems.

The major cause of rain production is moisture moving along three-dimensional zones of temperature and moisture contrasts known as weather fronts. If enough moisture and upward motion are present, precipitation falls from convective clouds (those with strong upward vertical motion) such as cumulonimbus (thunder clouds), which can organize into narrow rainbands. In mountainous areas, heavy precipitation is possible where upslope flow is maximized within windward sides of the terrain at elevation, which forces moist air to condense and fall out as rainfall along the sides of mountains. On the leeward side of mountains, desert climates can exist due to the dry air caused by downslope flow, which causes heating and drying of the air mass. The movement of the monsoon trough, or intertropical convergence zone, brings rainy seasons to savannah climes.

The urban heat island effect leads to increased rainfall, both in amounts and intensity, downwind of cities. Global warming is also causing changes in the precipitation pattern globally, including wetter conditions across eastern North America and drier conditions in the tropics. Antarctica is the driest continent. The globally averaged annual precipitation over land is 715 mm (28.1 in), but over the whole Earth it is much higher at 990 mm (39 in). Climate classification systems such as the Köppen classification system use average annual rainfall to help differentiate between differing climate regimes. Rainfall is measured using rain gauges. Rainfall amounts can be estimated by weather radar.

Rain is also known or suspected on other planets, where it may be composed of methane, neon, sulfuric acid, or even iron rather than water.

Formation

Water-saturated air

Rain falling on a field, in southern Estonia

Air contains water vapor, and the amount of water in a given mass of dry air, known as the mixing ratio, is measured in grams of water per kilogram of dry air (g/kg). The amount of moisture in air is also commonly reported as relative humidity, which is the percentage of the total water vapor air can hold at a particular air temperature. How much water vapor a parcel of air can contain before it becomes saturated (100% relative humidity) and forms into a cloud (a group of visible and tiny water and ice particles suspended above the Earth's surface) depends on its temperature. Warmer air can contain more water vapor than cooler air before becoming saturated. Therefore, one way to saturate a parcel of air is to cool it. The dew point is the temperature to which a parcel must be cooled in order to become saturated.
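The link between temperature, relative humidity and dew point can be illustrated with a short calculation. The sketch below uses the Magnus approximation for saturation vapor pressure; that formula and its coefficients (6.112 hPa, 17.62 and 243.12 °C) are assumptions brought in for illustration and do not come from the text above, and the function names are hypothetical.

```python
import math

# Magnus approximation coefficients (assumed for this sketch; commonly
# quoted for saturation over liquid water at ordinary temperatures).
A = 17.62
B = 243.12  # degrees Celsius


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Approximate saturation vapor pressure over liquid water, in hPa."""
    return 6.112 * math.exp(A * temp_c / (B + temp_c))


def dew_point_c(temp_c: float, relative_humidity_pct: float) -> float:
    """Temperature to which the air must be cooled to become saturated."""
    gamma = math.log(relative_humidity_pct / 100.0) + A * temp_c / (B + temp_c)
    return B * gamma / (A - gamma)


if __name__ == "__main__":
    # Drier air at the same temperature has a lower dew point,
    # so it must be cooled further before it saturates.
    for rh in (100.0, 60.0, 30.0):
        print(f"T = 25.0 C, RH = {rh:5.1f}% -> dew point = {dew_point_c(25.0, rh):.1f} C")
```

At 100% relative humidity the computed dew point equals the air temperature, which matches the definition of saturation given above.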

Streets in Tampere, Finland watered by night rain.

There are four main mechanisms for cooling the air to its dew point: adiabatic cooling, conductive cooling, radiational cooling, and evaporative cooling. Adiabatic cooling occurs when air rises and expands. The air can rise due to convection, large-scale atmospheric motions, or a physical barrier such as a mountain (orographic lift). Conductive cooling occurs when the air comes into contact with a colder surface, usually by being blown from one surface to another, for example from a liquid water surface to colder land. Radiational cooling occurs due to the emission of infrared radiation, either by the air or by the surface underneath. Evaporative cooling occurs when moisture is added to the air through evaporation, which forces the air temperature to cool to its wet-bulb temperature, or until it reaches saturation.

The main ways water vapor is added to the air are: wind convergence into areas of upward motion, precipitation or virga falling from above, daytime heating evaporating water from the surface of oceans, water bodies or wet land, transpiration from plants, cool or dry air moving over warmer water, and lifting air over mountains. Water vapor normally begins to condense on condensation nuclei such as dust, ice, and salt in order to form clouds. Elevated portions of weather fronts (which are three-dimensional in nature) force broad areas of upward motion within the Earth's atmosphere which form cloud decks such as altostratus or cirrostratus. Stratus is a stable cloud deck which tends to form when a cool, stable air mass is trapped underneath a warm air mass. It can also form due to the lifting of advection fog during breezy conditions.

Coalescence and fragmentation

The shape of rain drops depending upon their size

Coalescence occurs when water droplets fuse to create larger water droplets. Air resistance typically causes the water droplets in a cloud to remain stationary. When air turbulence occurs, water droplets collide, producing larger droplets.

Black Rain Clouds

As these larger water droplets descend, coalescence continues, so that drops become heavy enough to overcome air resistance and fall as rain. Coalescence generally happens most often in clouds above freezing, and is also known as the warm rain process. In clouds below freezing, when ice crystals gain enough mass they begin to fall. This generally requires more mass than coalescence when occurring between the crystal and neighboring water droplets. This process is temperature dependent, as supercooled water droplets only exist in a cloud that is below freezing. In addition, because of the great temperature difference between cloud and ground level, these ice crystals may melt as they fall and become rain.

Raindrops have sizes ranging from 0.1 to 9 mm (0.0039 to 0.3543 in) mean diameter but develop a tendency to break up at larger sizes. Smaller drops are called cloud droplets, and their shape is spherical. As a raindrop increases in size, its shape becomes more oblate, with its largest cross-section facing the oncoming airflow. Large rain drops become increasingly flattened on the bottom, like hamburger buns; very large ones are shaped like parachutes. Contrary to popular belief, their shape does not resemble a teardrop. The biggest raindrops on Earth were recorded over Brazil and the Marshall Islands in 2004 — some of them were as large as 10 mm (0.39 in). The large size is explained by condensation on large smoke particles or by collisions between drops in small regions with particularly high content of liquid water.

Rain drops associated with melting hail tend to be larger than other rain drops.

A raindrop on a leaf

Intensity and duration of rainfall are usually inversely related, i.e., high intensity storms are likely to be of short duration and low intensity storms can have a long duration.

Droplet size distribution

The final droplet size distribution is an exponential distribution. The number of droplets with diameter between $D$ and $D+dD$ per unit volume of space is $n(D)\,dD=n_{0}e^{-D/\langle D\rangle }\,dD$. This is commonly referred to as the Marshall–Palmer law after the researchers who first characterized it. The parameters are somewhat temperature-dependent, and the slope also scales with the rate of rainfall: $\langle D\rangle ^{-1}=41R^{-0.21}$ ($D$ in centimeters and $R$ in millimeters per hour).
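To make the exponential law concrete, the sketch below evaluates the slope relation quoted above and the resulting number density. The intercept value of 0.08 cm⁻⁴ is an assumption (a commonly quoted value, not given in the text above), and the function names are hypothetical.

```python
import math

# Assumed intercept n_0 in cm^-4; the text above does not give a value.
N0_PER_CM4 = 0.08


def slope_per_cm(rain_rate_mm_h: float) -> float:
    """Inverse mean diameter in cm^-1 for a rainfall rate in mm/h: 41 * R**-0.21."""
    return 41.0 * rain_rate_mm_h ** -0.21


def mean_diameter_mm(rain_rate_mm_h: float) -> float:
    """Mean drop diameter in millimeters (1 cm = 10 mm)."""
    return 10.0 / slope_per_cm(rain_rate_mm_h)


def number_density(diameter_cm: float, rain_rate_mm_h: float,
                   n0: float = N0_PER_CM4) -> float:
    """Exponential size distribution: n0 * exp(-diameter / mean diameter)."""
    return n0 * math.exp(-slope_per_cm(rain_rate_mm_h) * diameter_cm)


if __name__ == "__main__":
    for rate in (1.0, 5.0, 25.0):
        print(f"R = {rate:4.1f} mm/h -> mean diameter = {mean_diameter_mm(rate):.2f} mm")
```

Heavier rain gives a shallower slope, so the distribution shifts toward larger drops, while the count still falls off exponentially with diameter.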

Deviations can occur for small droplets and during different rainfall conditions. The distribution tends to fit averaged rainfall, while instantaneous size spectra often deviate and have been modeled as gamma distributions. The distribution has an upper limit due to droplet fragmentation.

Raindrop impacts

Raindrops impact at their terminal velocity, which is greater for larger drops due to their larger mass to drag ratio. At sea level and without wind, 0.5 mm (0.020 in) drizzle impacts at 2 m/s (6.6 ft/s) or 7.2 km/h (4.5 mph), while large 5 mm (0.20 in) drops impact at around 9 m/s (30 ft/s) or 32 km/h (20 mph).

Rain falling on loosely packed material such as newly fallen ash can produce dimples that can be fossilized, called raindrop impressions. The air density dependence of the maximum raindrop diameter together with fossil raindrop imprints has been used to constrain the density of the air 2.7 billion years ago.

The sound of raindrops hitting water is caused by bubbles of air oscillating underwater.

The METAR code for rain is RA, while the coding for rain showers is SHRA.

Virga

In certain conditions precipitation may fall from a cloud but then evaporate or sublime before reaching the ground. This is termed virga and is more often seen in hot and dry climates.

Causes

Frontal activity

Stratiform (a broad shield of precipitation with a relatively similar intensity) and dynamic precipitation (convective precipitation which is showery in nature with large changes in intensity over short distances) occur as a consequence of slow ascent of air in synoptic systems (on the order of cm/s), such as in the vicinity of cold fronts and near and poleward of surface warm fronts. Similar ascent is seen around tropical cyclones outside the eyewall, and in comma-head precipitation patterns around mid-latitude cyclones. A wide variety of weather can be found along an occluded front, with thunderstorms possible, but usually their passage is associated with a drying of the air mass. Occluded fronts usually form around mature low-pressure areas. What separates rainfall from other precipitation types, such as ice pellets and snow, is the presence of a thick layer of air aloft which is above the melting point of water, which melts the frozen precipitation well before it reaches the ground. If there is a shallow near surface layer that is below freezing, freezing rain (rain which freezes on contact with surfaces in subfreezing environments) will result. Hail becomes an increasingly infrequent occurrence when the freezing level within the atmosphere exceeds 3,400 m (11,000 ft) above ground level.

Convection

Diagram showing that as moist air becomes heated more than its surroundings, it moves upward, resulting in brief rain showers.

Convective precipitation

Diagram showing how moist air over the ocean rises and flows over the land, causing cooling and rain as it hits mountain ridges.

Orographic precipitation

Convective rain, or showery precipitation, occurs from convective clouds (e.g., cumulonimbus or cumulus congestus). It falls as showers with rapidly changing intensity. Convective precipitation falls over a certain area for a relatively short time, as convective clouds have limited horizontal extent. Most precipitation in the tropics appears to be convective; however, it has been suggested that stratiform precipitation also occurs. Graupel and hail indicate convection. In mid-latitudes, convective precipitation is intermittent and often associated with baroclinic boundaries such as cold fronts, squall lines, and warm fronts.

Orographic effects

Orographic precipitation occurs on the windward side of mountains and is caused by the rising air motion of a large-scale flow of moist air across the mountain ridge, resulting in adiabatic cooling and condensation. In mountainous parts of the world subjected to relatively consistent winds (for example, the trade winds), a more moist climate usually prevails on the windward side of a mountain than on the leeward or downwind side. Moisture is removed by orographic lift, leaving drier air (see katabatic wind) on the descending and generally warming, leeward side where a rain shadow is observed.

In Hawaii, Mount Waiʻaleʻale, on the island of Kauai, is notable for its extreme rainfall: it is among the places with the highest rainfall in the world, with 9,500 mm (373 in) a year. Systems known as Kona storms affect the state with heavy rains between October and April. Local climates vary considerably on each island due to their topography, divisible into windward (Koʻolau) and leeward (Kona) regions based upon location relative to the higher mountains. Windward sides face the east to northeast trade winds and receive much more rainfall; leeward sides are drier and sunnier, with less rain and less cloud cover.

In South America, the Andes mountain range blocks Pacific moisture that arrives in that continent, resulting in a desertlike climate just downwind across western Argentina. The Sierra Nevada range creates the same effect in North America, forming the Great Basin and Mojave Deserts.

Within the tropics

Rainfall distribution by month in Cairns showing the extent of the wet season at that location

The wet, or rainy, season is the time of year, covering one or more months, when most of the average annual rainfall in a region falls. The term green season is also sometimes used as a euphemism by tourist authorities. Areas with wet seasons are dispersed across portions of the tropics and subtropics. Savanna climates and areas with monsoon regimes have wet summers and dry winters. Tropical rainforests technically do not have dry or wet seasons, since their rainfall is equally distributed through the year. Some areas with pronounced rainy seasons will see a break in rainfall mid-season when the intertropical convergence zone or monsoon trough moves poleward of their location during the middle of the warm season. When the wet season occurs during the warm season, or summer, rain falls mainly during the late afternoon and early evening hours. The wet season is a time when air quality improves, freshwater quality improves, and vegetation grows significantly.

Tropical cyclones, a source of very heavy rainfall, consist of large air masses several hundred miles across with low pressure at the centre and with winds blowing inward towards the centre in either a clockwise direction (southern hemisphere) or a counterclockwise direction (northern hemisphere). Although cyclones can take an enormous toll in lives and personal property, they may be important factors in the precipitation regimes of places they impact, as they may bring much-needed precipitation to otherwise dry regions. Areas in their path can receive a year's worth of rainfall from a tropical cyclone passage.

Human influence

Image of Atlanta, US showing temperature distribution, with blue showing cool temperatures, red warm, and hot areas appearing white.

Average surface air temperatures from 2011 to 2020 compared to the 1951–1980 average. Source: NASA

The fine particulate matter produced by car exhaust and other human sources of pollution forms cloud condensation nuclei, leads to the production of clouds and increases the likelihood of rain. As commuters and commercial traffic cause pollution to build up over the course of the week, the likelihood of rain increases: it peaks by Saturday, after five days of weekday pollution have built up. In heavily populated areas that are near the coast, such as the United States' Eastern Seaboard, the effect can be dramatic: there is a 22% higher chance of rain on Saturdays than on Mondays. The urban heat island effect warms cities 0.6 to 5.6 °C (1.1 to 10.1 °F) above surrounding suburbs and rural areas. This extra heat leads to greater upward motion, which can induce additional shower and thunderstorm activity. Rainfall rates downwind of cities are increased between 48% and 116%. Partly as a result of this warming, monthly rainfall is about 28% greater between 32 and 64 km (20 and 40 mi) downwind of cities, compared with upwind. Some cities induce a total precipitation increase of 51%.

Increasing temperatures tend to increase evaporation which can lead to more precipitation. Precipitation generally increased over land north of 30°N from 1900 through 2005 but has declined over the tropics since the 1970s. Globally there has been no statistically significant overall trend in precipitation over the past century, although trends have varied widely by region and over time. Eastern portions of North and South America, northern Europe, and northern and central Asia have become wetter. The Sahel, the Mediterranean, southern Africa and parts of southern Asia have become drier. There has been an increase in the number of heavy precipitation events over many areas during the past century, as well as an increase since the 1970s in the prevalence of droughts—especially in the tropics and subtropics. Changes in precipitation and evaporation over the oceans are suggested by the decreased salinity of mid- and high-latitude waters (implying more precipitation), along with increased salinity in lower latitudes (implying less precipitation and/or more evaporation). Over the contiguous United States, total annual precipitation increased at an average rate of 6.1 percent since 1900, with the greatest increases within the East North Central climate region (11.6 percent per century) and the South (11.1 percent). Hawaii was the only region to show a decrease (−9.25 percent).

Analysis of 65 years of United States rainfall records shows that the lower 48 states have seen an increase in heavy downpours since 1950. The largest increases are in the Northeast and Midwest, which in the past decade have seen 31 and 16 percent more heavy downpours, respectively, compared to the 1950s. Rhode Island is the state with the largest increase, 104%. McAllen, Texas is the city with the largest increase, 700%. Heavy downpours in the analysis are the days on which total precipitation exceeded the top one percent of all rain and snow days during the years 1950–2014.

The most successful attempts at influencing weather involve cloud seeding, which includes techniques used to increase winter precipitation over mountains and suppress hail.

Characteristics

Patterns

Band of thunderstorms seen on a weather radar display

Rainbands are cloud and precipitation areas which are significantly elongated. Rainbands can be stratiform or convective, and are generated by differences in temperature. When noted on weather radar imagery, this precipitation elongation is referred to as banded structure. Rainbands in advance of warm occluded fronts and warm fronts are associated with weak upward motion, and tend to be wide and stratiform in nature.

Rainbands spawned near and ahead of cold fronts can be squall lines which are able to produce tornadoes. Rainbands associated with cold fronts can be warped by mountain barriers perpendicular to the front's orientation due to the formation of a low-level barrier jet. Bands of thunderstorms can form with sea breeze and land breeze boundaries, if enough moisture is present. If sea breeze rainbands become active enough just ahead of a cold front, they can mask the location of the cold front itself.

Once a cyclone occludes, an occluded front (a trough of warm air aloft) will be caused by strong southerly winds on its eastern periphery rotating aloft around its northeast, and ultimately northwestern, periphery (also termed the warm conveyor belt), forcing a surface trough to continue into the cold sector on a similar curve to the occluded front. The front creates the portion of an occluded cyclone known as its comma head, due to the comma-like shape of the mid-tropospheric cloudiness that accompanies the feature. It can also be the focus of locally heavy precipitation, with thunderstorms possible if the atmosphere along the front is unstable enough for convection. Banding within the comma head precipitation pattern of an extratropical cyclone can yield significant amounts of rain. Behind extratropical cyclones during fall and winter, rainbands can form downwind of relatively warm bodies of water such as the Great Lakes. Downwind of islands, bands of showers and thunderstorms can develop due to low-level wind convergence downwind of the island edges. Offshore of California, this has been noted in the wake of cold fronts.

Rainbands within tropical cyclones are curved in orientation. Tropical cyclone rainbands contain showers and thunderstorms that, together with the eyewall and the eye, constitute a hurricane or tropical storm. The extent of rainbands around a tropical cyclone can help determine the cyclone's intensity.

Acidity

Sources of acid rain

The phrase acid rain was first used by Scottish chemist Robert Angus Smith in 1852. The pH of rain varies, especially due to its origin. On America's East Coast, rain that is derived from the Atlantic Ocean typically has a pH of 5.0–5.6; rain that comes across the continent from the west has a pH of 3.8–4.8; and local thunderstorms can have a pH as low as 2.0. Rain becomes acidic primarily due to the presence of two strong acids, sulfuric acid (H₂SO₄) and nitric acid (HNO₃). Sulfuric acid is derived from natural sources such as volcanoes and wetlands (sulfate-reducing bacteria), and from anthropogenic sources such as the combustion of fossil fuels and mining where H₂S is present. Nitric acid is produced by natural sources such as lightning, soil bacteria, and natural fires, and is also produced anthropogenically by the combustion of fossil fuels and from power plants. In the past 20 years the concentrations of nitric and sulfuric acid in rainwater have decreased, which may be due to the significant increase in ammonium (most likely as ammonia from livestock production), which acts as a buffer in acid rain and raises the pH.
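Because pH is a logarithmic scale, the differences between the values quoted above are larger than they look. The sketch below assumes the standard textbook definition of pH as the negative base-10 logarithm of the hydrogen ion concentration in mol/L (an approximation that ignores activity corrections and is not stated in the text above); the function names are hypothetical.

```python
def hydrogen_ion_concentration(ph: float) -> float:
    """Approximate hydrogen ion concentration in mol/L for a given pH."""
    return 10.0 ** (-ph)


def acidity_ratio(ph_a: float, ph_b: float) -> float:
    """How many times more hydrogen ions a solution at ph_a holds than one at ph_b."""
    return 10.0 ** (ph_b - ph_a)


if __name__ == "__main__":
    # Compare a very acidic thunderstorm (pH 2.0) and continental rain (pH 3.8)
    # against Atlantic-derived rain at the upper end of its range (pH 5.6).
    for ph in (2.0, 3.8):
        print(f"pH {ph} vs pH 5.6: {acidity_ratio(ph, 5.6):,.0f} times the hydrogen ions")
```

Each drop of one pH unit corresponds to a tenfold increase in hydrogen ion concentration under this definition.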

Köppen climate classification

Updated Köppen–Geiger climate map (legend classes shown: BWh, BWk, BSh, BSk, Csa, Csb, Cwa, Cwb, Cfa, Cfb, Cfc, Dsa, Dsb, Dsc, Dsd, Dwa, Dwb, Dwc, Dwd, Dfa, Dfb, Dfc, Dfd)

The Köppen classification depends on average monthly values of temperature and precipitation. The most commonly used form of the Köppen classification has five primary types labeled A through E. Specifically, the primary types are A, tropical; B, dry; C, mild mid-latitude; D, cold mid-latitude; and E, polar. The five primary classifications can be further divided into secondary classifications such as rain forest, monsoon, tropical savanna, humid subtropical, humid continental, oceanic climate, Mediterranean climate, steppe, subarctic climate, tundra, polar ice cap, and desert.

Rain forests are characterized by high rainfall, with definitions setting minimum normal annual rainfall between 1,750 and 2,000 mm (69 and 79 in). A tropical savanna is a grassland biome located in semi-arid to semi-humid climate regions of subtropical and tropical latitudes, with rainfall between 750 and 1,270 mm (30 and 50 in) a year. Savannas are widespread in Africa, and are also found in India, the northern parts of South America, Malaysia, and Australia. The humid subtropical climate zone is where winter rainfall is associated with large storms that the westerlies steer from west to east. Most summer rainfall occurs during thunderstorms and from occasional tropical cyclones. Humid subtropical climates lie on the east side of continents, roughly between latitudes 20° and 40° from the equator.

An oceanic (or maritime) climate is typically found along the west coasts at the middle latitudes of all the world's continents, bordering cool oceans, as well as southeastern Australia, and is accompanied by plentiful precipitation year-round. The Mediterranean climate regime resembles the climate of the lands in the Mediterranean Basin, parts of western North America, parts of Western and South Australia, in southwestern South Africa and in parts of central Chile. The climate is characterized by hot, dry summers and cool, wet winters. A steppe is a dry grassland. Subarctic climates are cold with continuous permafrost and little precipitation.

Measurement

Gauges

Standard rain gauge

Rain is measured in units of length per unit time, typically in millimeters per hour, or in countries where imperial units are more common, inches per hour. The "length", or more accurately, "depth" being measured is the depth of rain water that would accumulate on a flat, horizontal and impermeable surface during a given amount of time, typically an hour. One millimeter of rainfall is the equivalent of one liter of water per square meter.
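The equivalence between depth and volume follows directly from the units, and a short sketch makes the arithmetic explicit. The function names and the 100 m² roof used in the usage line are hypothetical.

```python
def rainfall_volume_liters(depth_mm: float, area_m2: float) -> float:
    """Volume of water from a rainfall depth over a flat, impermeable area."""
    depth_m = depth_mm / 1000.0      # millimeters to meters
    volume_m3 = depth_m * area_m2    # depth times area
    return volume_m3 * 1000.0        # 1 cubic meter = 1000 liters


def accumulated_depth_mm(rate_mm_per_h: float, hours: float) -> float:
    """Depth accumulated at a constant rainfall rate."""
    return rate_mm_per_h * hours


if __name__ == "__main__":
    depth = accumulated_depth_mm(5.0, 3.0)  # 5 mm/h sustained for 3 hours
    print(f"{depth} mm over a 100 m^2 roof is about "
          f"{rainfall_volume_liters(depth, 100.0):.0f} liters")
```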

The standard way of measuring rainfall or snowfall is the standard rain gauge, which can be found in 100-mm (4-in) plastic and 200-mm (8-in) metal varieties. The inner cylinder is filled by 25 mm (0.98 in) of rain, with overflow flowing into the outer cylinder. Plastic gauges have markings on the inner cylinder down to 0.25 mm (0.0098 in) resolution, while metal gauges require use of a stick designed with the appropriate 0.25 mm (0.0098 in) markings. Once the inner cylinder is full, its contents are recorded and discarded, and it is refilled with the rainfall remaining in the outer cylinder; this is repeated, adding each amount to the overall total, until the outer cylinder is empty. Other types of gauges include the popular wedge gauge (the cheapest rain gauge and most fragile), the tipping bucket rain gauge, and the weighing rain gauge. For those looking to measure rainfall as inexpensively as possible, a cylindrical can with straight sides will act as a rain gauge if left out in the open, but its accuracy will depend on the ruler used to measure the rain. Any of the above rain gauges can be made at home, with enough know-how.

Once a precipitation measurement is made, it can be submitted through the Internet to one of the various networks that exist across the United States and elsewhere, such as CoCoRaHS or GLOBE. If a network is not available in the area where one lives, the nearest local weather or met office will likely be interested in the measurement.

Remote sensing

Twenty-four-hour rainfall accumulation on the Val d'Irène radar in Eastern Canada. Zones without data in the east and southwest are caused by beam blocking from mountains. (Source: Environment Canada)

One of the main uses of weather radar is to assess the amount of precipitation that has fallen over large basins for hydrological purposes. For instance, river flood control, sewer management and dam construction are all areas where planners use rainfall accumulation data. Radar-derived rainfall estimates complement surface station data, which can be used for calibration. To produce radar accumulations, rain rates over a point are estimated by using the value of reflectivity data at individual grid points. A radar equation is then used, which is

$Z=AR^{b},$

where Z represents the radar reflectivity, R represents the rainfall rate, and A and b are constants. Satellite-derived rainfall estimates use passive microwave instruments aboard polar-orbiting as well as geostationary weather satellites to indirectly measure rainfall rates. If one wants an accumulated rainfall over a time period, one has to add up all the accumulations from each grid box within the images during that time.
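The reflectivity relation can be inverted to estimate a rain rate at a grid point, and successive estimates can then be summed into an accumulation. The sketch below is illustrative only: the constants A = 200 and b = 1.6 are an assumption (a commonly quoted pair, not given in the text above), reflectivity is assumed to be in mm⁶/m³ with the usual dBZ conversion, and the function names are hypothetical.

```python
def dbz_to_linear(dbz: float) -> float:
    """Convert reflectivity from dBZ to linear units (assumed mm^6 per m^3)."""
    return 10.0 ** (dbz / 10.0)


def rain_rate_mm_per_h(z_linear: float, a: float = 200.0, b: float = 1.6) -> float:
    """Invert Z = A * R**b for the rain rate R. A and b are assumed values."""
    return (z_linear / a) ** (1.0 / b)


def accumulation_mm(rates_mm_per_h: list[float], interval_minutes: float) -> float:
    """Sum rain rates sampled at a fixed interval into a depth for one grid box."""
    return sum(rates_mm_per_h) * interval_minutes / 60.0


if __name__ == "__main__":
    scans_dbz = [20.0, 30.0, 40.0, 35.0]  # hypothetical scans, 15 minutes apart
    rates = [rain_rate_mm_per_h(dbz_to_linear(d)) for d in scans_dbz]
    print(f"accumulation over one hour: {accumulation_mm(rates, 15.0):.1f} mm")
```

In practice the constants vary with precipitation type, which is one reason gauge data is used to calibrate radar estimates.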

Intensity

Rainfall intensity is classified according to the rate of precipitation, which depends on the considered time. The following categories are used to classify rainfall intensity:

Light rain: precipitation rate below 2.5 mm (0.098 in) per hour
Moderate rain: precipitation rate between 2.5 mm (0.098 in) and 7.6 mm (0.30 in) per hour, or up to 10 mm (0.39 in) per hour, depending on the definition used
Heavy rain: precipitation rate above 7.6 mm (0.30 in) per hour, or between 10 mm (0.39 in) and 50 mm (2.0 in) per hour, depending on the definition used
Violent rain: precipitation rate above 50 mm (2.0 in) per hour
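The categories above can be sketched as a small lookup. Because the list gives two alternative boundaries between moderate and heavy rain, the function below takes that boundary as a parameter; the default of 7.6 mm per hour, the handling of rates that fall exactly on a boundary, and the function name are all choices made for this illustration.

```python
def classify_rain(rate_mm_per_h: float, heavy_threshold: float = 7.6) -> str:
    """Classify rainfall intensity from a precipitation rate in mm per hour.

    heavy_threshold is the moderate/heavy boundary: 7.6 or 10.0 mm/h,
    depending on which definition is followed.
    """
    if rate_mm_per_h < 0:
        raise ValueError("precipitation rate cannot be negative")
    if rate_mm_per_h < 2.5:
        return "light"
    if rate_mm_per_h <= heavy_threshold:
        return "moderate"
    if rate_mm_per_h <= 50.0:
        return "heavy"
    return "violent"


if __name__ == "__main__":
    for rate in (1.0, 5.0, 9.0, 60.0):
        print(f"{rate:5.1f} mm/h -> {classify_rain(rate)}")
```

A rate of 9 mm per hour is heavy under the 7.6 mm boundary but moderate under the 10 mm boundary, which is why the threshold is left configurable.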

Euphemisms for a heavy or violent rain include gully washer, trash-mover and toad-strangler. The intensity can also be expressed by rainfall erosivity R-factor or in terms of the rainfall time-structure n-index.

Return period

The average time between occurrences of an event with a specified intensity and duration is called the return period. The intensity of a storm can be predicted for any return period and storm duration, from charts based on historic data for the location. The return period is often expressed as an n-year event. For instance, a 10-year storm describes a rare rainfall event occurring on average once every 10 years. The rainfall will be greater and the flooding will be worse than the worst storm expected in any single year. A 100-year storm describes an extremely rare rainfall event occurring on average once in a century. The rainfall will be extreme and flooding worse than a 10-year event. The probability of an event in any year is the inverse of the return period (assuming the probability remains the same for each year). For instance, a 10-year storm has a 10 percent probability of occurring in any given year, and a 100-year storm has a 1 percent probability of occurring in any given year. As with all probability events, it is possible, though improbable, to have multiple 100-year storms in a single year.
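The relationship between return period and probability can be worked through numerically. The sketch below adds one assumption beyond the text: that years are statistically independent, which allows the chance of at least one event over several years to be computed. The function names are hypothetical.

```python
def annual_probability(return_period_years: float) -> float:
    """Probability of the event in any single year: the inverse of the return period."""
    return 1.0 / return_period_years


def probability_at_least_once(return_period_years: float, years: int) -> float:
    """Chance of one or more events in a span of years, assuming independent years."""
    p_none = (1.0 - annual_probability(return_period_years)) ** years
    return 1.0 - p_none


if __name__ == "__main__":
    for period, span in ((10, 10), (100, 30), (100, 100)):
        chance = probability_at_least_once(period, span)
        print(f"{period}-year storm within {span} years: {chance:.1%}")
```

Under these assumptions a 100-year storm is far from guaranteed within any given century, and is not ruled out in consecutive years either.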

Forecasting

Example of a five-day rainfall forecast from the Hydrometeorological Prediction Center

The Quantitative Precipitation Forecast (abbreviated QPF) is the expected amount of liquid precipitation accumulated over a specified time period over a specified area. A QPF will be specified when a measurable precipitation type reaching a minimum threshold is forecast for any hour during a QPF valid period. Precipitation forecasts tend to be bound by synoptic hours such as 0000, 0600, 1200 and 1800 GMT. Terrain is considered in QPFs by use of topography or based upon climatological precipitation patterns from observations with fine detail. Starting in the mid to late 1990s, QPFs were used within hydrologic forecast models to simulate impact to rivers throughout the United States. Forecast models show significant sensitivity to humidity levels within the planetary boundary layer, or in the lowest levels of the atmosphere, which decreases with height. QPFs can be generated on a quantitative basis, forecasting amounts, or on a qualitative basis, forecasting the probability of a specific amount. Radar imagery forecasting techniques show higher skill than model forecasts within 6 to 7 hours of the time of the radar image. The forecasts can be verified through use of rain gauge measurements, weather radar estimates, or a combination of both. Various skill scores can be determined to measure the value of the rainfall forecast.

Impact

Agricultural

Rainfall estimates for southern Japan and the surrounding region from July 20–27, 2009.

Precipitation, especially rain, has a dramatic effect on agriculture. All plants need at least some water to survive, so rain (being the most effective means of watering) is important to agriculture. While a regular rain pattern is usually vital to healthy plants, too much or too little rainfall can be harmful, even devastating, to crops. Drought can kill crops and increase erosion, while overly wet weather can cause harmful fungus growth. Plants need varying amounts of rainfall to survive. For example, certain cacti require small amounts of water, while tropical plants may need up to hundreds of inches of rain per year to survive.

In areas with wet and dry seasons, soil nutrients diminish and erosion increases during the wet season. Animals have adaptation and survival strategies for the wetter regime. The previous dry season leads to food shortages into the wet season, as the crops have yet to mature. Developing countries have noted that their populations show seasonal weight fluctuations due to food shortages seen before the first harvest, which occurs late in the wet season. Rain may be harvested through the use of rainwater tanks and then treated for potable use, or used untreated for non-potable purposes indoors or for irrigation. Excessive rain during short periods of time can cause flash floods.

Culture and religion

A rain dance being performed in Harar, Ethiopia

Cultural attitudes towards rain differ across the world. In temperate climates, people tend to be more stressed when the weather is unstable or cloudy, with its impact greater on men than women. Rain can also bring joy, as some consider it to be soothing or enjoy the aesthetic appeal of it. In dry places, such as India, or during periods of drought, rain lifts people's moods. In Botswana, the Setswana word for rain, pula, is used as the name of the national currency, in recognition of the economic importance of rain in the country, since it has a desert climate. Several cultures have developed means of dealing with rain, including numerous protection devices such as umbrellas and raincoats, and diversion devices such as gutters and storm drains that lead rains to sewers. Many people find the scent during and immediately after rain pleasant or distinctive. The source of this scent is petrichor, an oil produced by plants, then absorbed by rocks and soil, and later released into the air during rainfall.

Rain, depicted in the 1493 Nuremberg Chronicle

Rain holds an important religious significance in many cultures. The ancient Sumerians believed that rain was the semen of the sky-god An, which fell from the heavens to inseminate his consort, the earth-goddess Ki, causing her to give birth to all the plants of the earth. The Akkadians believed that the clouds were the breasts of Anu's consort Antu and that rain was milk from her breasts. According to Jewish tradition, in the first century BC, the Jewish miracle-worker Honi ha-M'agel ended a three-year drought in Judaea by drawing a circle in the sand and praying for rain, refusing to leave the circle until his prayer was granted. In his Meditations, the Roman emperor Marcus Aurelius preserves a prayer for rain made by the Athenians to the Greek sky-god Zeus. Various Native American tribes are known to have historically conducted rain dances in effort to encourage rainfall. Rainmaking rituals are also important in many African cultures. In the present-day United States, various state governors have held Days of Prayer for rain, including the Days of Prayer for Rain in the State of Texas in 2011.

Global climatology

Approximately 505,000 km³ (121,000 cu mi) of water falls as precipitation each year across the globe with 398,000 km³ (95,000 cu mi) of it over the oceans. Given the Earth's surface area, that means the globally averaged annual precipitation is 990 mm (39 in). Deserts are defined as areas with an average annual precipitation of less than 250 mm (10 in) per year, or as areas where more water is lost by evapotranspiration than falls as precipitation.
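The 990 mm figure follows from dividing the annual precipitation volume by the Earth's surface area. A minimal sketch of that arithmetic, assuming a total surface area of about 510 million km² (a value that is not given in the text above):

```python
# Rough check of the globally averaged precipitation depth quoted above.
PRECIP_VOLUME_KM3 = 505_000      # annual precipitation volume, from the text
OCEAN_VOLUME_KM3 = 398_000       # portion falling over the oceans, from the text
EARTH_SURFACE_KM2 = 510_072_000  # assumption: not stated in the text

MM_PER_KM = 1_000_000


def mean_depth_mm(volume_km3, area_km2):
    """Average depth in mm of a water volume spread evenly over an area."""
    return volume_km3 / area_km2 * MM_PER_KM


global_depth_mm = mean_depth_mm(PRECIP_VOLUME_KM3, EARTH_SURFACE_KM2)
land_volume_km3 = PRECIP_VOLUME_KM3 - OCEAN_VOLUME_KM3
ocean_share = OCEAN_VOLUME_KM3 / PRECIP_VOLUME_KM3

print(f"global mean depth: {global_depth_mm:.0f} mm")
print(f"volume over land: {land_volume_km3} km^3")
print(f"share over oceans: {ocean_share:.0%}")
```

The subtraction also shows that roughly 107,000 km³ of the total falls over land.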

Deserts

Largest deserts

Isolated towering vertical desert shower

The northern half of Africa is dominated by the world's most extensive hot, dry region, the Sahara Desert. Some deserts also occupy much of southern Africa: the Namib and the Kalahari. Across Asia, a large annual rainfall minimum, composed primarily of deserts, stretches from the Gobi Desert in Mongolia west-southwest through western Pakistan (Balochistan) and Iran into the Arabian Desert in Saudi Arabia. Most of Australia is semi-arid or desert, making it the world's driest inhabited continent. In South America, the Andes mountain range blocks Pacific moisture that arrives in that continent, resulting in a desertlike climate just downwind across western Argentina. The drier areas of the United States are regions where the Sonoran Desert overspreads the Desert Southwest, the Great Basin and central Wyoming.

Polar deserts

Since rain only falls as liquid, it rarely falls when surface temperatures are below freezing, unless there is a layer of warm air aloft, in which case it becomes freezing rain. Due to the entire atmosphere being below freezing most of the time, very cold climates see very little rainfall and are often known as polar deserts. A common biome in this area is the tundra which has a short summer thaw and a long frozen winter. Ice caps see no rain at all, making Antarctica the world's driest continent.

Rainforests

Rainforests are areas of the world with very high rainfall. Both tropical and temperate rainforests exist. Tropical rainforests occupy a large band of the planet mostly along the equator. Most temperate rainforests are located on mountainous west coasts between 45 and 55 degrees latitude, but they are often found in other areas.

Around 40–75% of all biotic life is found in rainforests. Rainforests are also responsible for 28% of the world's oxygen turnover.

Monsoons

The equatorial region near the Intertropical Convergence Zone (ITCZ), or monsoon trough, is the wettest portion of the world's continents. Annually, the rain belt within the tropics marches northward by August, then moves back southward into the Southern Hemisphere by February and March. Within Asia, rainfall is favored across its southern portion from India east and northeast across the Philippines and southern China into Japan due to the monsoon advecting moisture primarily from the Indian Ocean into the region. The monsoon trough can reach as far north as the 40th parallel in East Asia during August before moving southward thereafter. Its poleward progression is accelerated by the onset of the summer monsoon which is characterized by the development of lower air pressure (a thermal low) over the warmest part of Asia. Similar, but weaker, monsoon circulations are present over North America and Australia. During the summer, the Southwest monsoon combined with Gulf of California and Gulf of Mexico moisture moving around the subtropical ridge in the Atlantic Ocean bring the promise of afternoon and evening thunderstorms to the southern tier of the United States as well as the Great Plains. The eastern half of the contiguous United States east of the 98th meridian, the mountains of the Pacific Northwest, and the Sierra Nevada range are the wetter portions of the nation, with average rainfall exceeding 760 mm (30 in) per year. Tropical cyclones enhance precipitation across southern sections of the United States, as well as Puerto Rico, the United States Virgin Islands, the Northern Mariana Islands, Guam, and American Samoa.

Impact of the Westerlies

Long-term mean precipitation by month

Westerly flow from the mild north Atlantic leads to wetness across western Europe, in particular Ireland and the United Kingdom, where the western coasts can receive between 1,000 mm (39 in) of rain per year at sea level and 2,500 mm (98 in) on the mountains. Bergen, Norway is one of the more famous European rain-cities with its yearly precipitation of 2,250 mm (89 in) on average. During the fall, winter, and spring, Pacific storm systems bring most of Hawaii and the western United States much of their precipitation. Over the top of the ridge, the jet stream brings a summer precipitation maximum to the Great Lakes. Large thunderstorm areas known as mesoscale convective complexes move through the Plains, Midwest, and Great Lakes during the warm season, contributing up to 10% of the annual precipitation to the region.

The El Niño-Southern Oscillation affects the precipitation distribution, by altering rainfall patterns across the western United States, Midwest, the Southeast, and throughout the tropics. There is also evidence that global warming is leading to increased precipitation to the eastern portions of North America, while droughts are becoming more frequent in the tropics and subtropics.

Wettest known locations

Cherrapunji, situated on the southern slopes of the Eastern Himalaya in Shillong, India is the confirmed wettest place on Earth, with an average annual rainfall of 11,430 mm (450 in). The highest recorded rainfall in a single year was 22,987 mm (905.0 in) in 1861. The 38-year average at nearby Mawsynram, Meghalaya, India is 11,873 mm (467.4 in). The wettest spot in Australia is Mount Bellenden Ker in the north-east of the country which records an average of 8,000 mm (310 in) per year, with over 12,200 mm (480.3 in) of rain recorded during 2000. The Big Bog on the island of Maui has the highest average annual rainfall in the Hawaiian Islands, at 10,300 mm (404 in). Mount Waiʻaleʻale on the island of Kauaʻi receives similarly torrential rain, slightly less than the Big Bog, averaging 9,500 mm (373 in) per year over the last 32 years, with a record 17,340 mm (683 in) in 1982. Its summit is considered one of the rainiest spots on earth, with a reported 350 days of rain per year.

Lloró, a town situated in Chocó, Colombia, is probably the place with the largest rainfall in the world, averaging 13,300 mm (523.6 in) per year. The Department of Chocó is extraordinarily humid. Tutunendaó, a small town situated in the same department, is one of the wettest estimated places on Earth, averaging 11,394 mm (448.6 in) per year; in 1974 the town received 26,303 mm (86 ft 3.6 in), the largest annual rainfall measured in Colombia. Unlike Cherrapunji, which receives most of its rainfall between April and September, Tutunendaó receives rain almost uniformly distributed throughout the year. Quibdó, the capital of Chocó, receives the most rain in the world among cities with over 100,000 inhabitants: 9,000 mm (354 in) per year. Storms in Chocó can drop 500 mm (20 in) of rainfall in a day. This amount is more than what falls in many cities in a year's time.

Continent	Highest average (in)	Highest average (mm)	Place	Elevation (ft)	Elevation (m)	Years of record
South America	523.6	13,299	Lloró, Colombia (estimated)	520	158	29
Asia	467.4	11,872	Mawsynram, India	4,597	1,401	39
Africa	405.0	10,287	Debundscha, Cameroon	30	9.1	32
Oceania	404.3	10,269	Big Bog, Maui, Hawaii (USA)	5,148	1,569	30
South America	354.0	8,992	Quibdo, Colombia	120	36.6	16
Australia	340.0	8,636	Mount Bellenden Ker, Queensland	5,102	1,555	9
North America	256.0	6,502	Hucuktlis Lake, British Columbia	12	3.66	14
Europe	183.0	4,648	Crkvice, Montenegro	3,337	1,017	22
Source (without conversions): Global Measured Extremes of Temperature and Precipitation, National Climatic Data Center. August 9, 2004.
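Since the source reports the values without conversions, the millimetre column in the table above is derived from the inch column and can be spot-checked. A short sketch using the exact definition 1 in = 25.4 mm; the row list is transcribed by hand from the table, and the variable names are illustrative only:

```python
# Spot-check of the inch-to-millimetre conversions in the table above.
MM_PER_INCH = 25.4  # exact by definition

# (place, highest average in inches, highest average in mm), from the table
rows = [
    ("Lloró, Colombia", 523.6, 13299),
    ("Mawsynram, India", 467.4, 11872),
    ("Debundscha, Cameroon", 405.0, 10287),
    ("Big Bog, Maui, Hawaii", 404.3, 10269),
    ("Quibdo, Colombia", 354.0, 8992),
    ("Mount Bellenden Ker, Queensland", 340.0, 8636),
    ("Hucuktlis Lake, British Columbia", 256.0, 6502),
    ("Crkvice, Montenegro", 183.0, 4648),
]


def inches_to_mm(inches):
    """Convert a rainfall depth from inches to millimetres."""
    return inches * MM_PER_INCH


# Collect any row whose rounded conversion disagrees with the table.
mismatches = [
    place for place, inches, mm in rows
    if round(inches_to_mm(inches)) != mm
]
print(mismatches)
```

An empty list means every millimetre value agrees with its inch value to the nearest millimetre.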

	Continent	Place	Highest rainfall (in)	Highest rainfall (mm)
Highest average annual rainfall	Asia	Mawsynram, India	467.4	11,870
Highest in one year	Asia	Cherrapunji, India	1,042	26,470
Highest in one calendar month	Asia	Cherrapunji, India	366	9,296
Highest in 24 hours	Indian Ocean	Foc Foc, La Réunion	71.8	1,820
Highest in 12 hours	Indian Ocean	Foc Foc, La Réunion	45.0	1,140
Highest in one minute	North America	Unionville, Maryland, USA	1.23	31.2

Outside Earth

Rainfalls of diamonds have been suggested to occur on the gas giant planets, Jupiter and Saturn, as well as on the ice giant planets, Uranus and Neptune. There is likely to be rain of various compositions in the upper atmospheres of the gas giants, as well as precipitation of liquid neon in the deep atmospheres. On Titan, Saturn's largest natural satellite, infrequent methane rain is thought to carve the moon's numerous surface channels. On Venus, sulfuric acid virga evaporates 25 km (16 mi) from the surface. Extrasolar planet OGLE-TR-56b in the constellation Sagittarius is hypothesized to have iron rain. Accordingly, research carried out by the European Southern Observatory shows that WASP-76b can produce showers of burning liquid iron droplets once temperature decreases during the planet's night hours. There is evidence from samples of basalt brought back by the Apollo missions that the Moon has been subject to lava rain.

Sunday, August 21, 2022

Non-coding DNA

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Non-coding_DNA

Non-coding DNA (ncDNA) sequences are components of an organism's DNA that do not encode protein sequences. Some non-coding DNA is transcribed into functional non-coding RNA molecules (e.g. transfer RNA, microRNA, piRNA, ribosomal RNA, and regulatory RNAs). Other functional regions of the non-coding DNA fraction include regulatory sequences that control gene expression; scaffold attachment regions; origins of DNA replication; centromeres; and telomeres. Some regions appear to be mostly nonfunctional such as introns, pseudogenes, intergenic DNA, and fragments of transposons and viruses. These apparently non-functional regions take up most of the genome of many eukaryotes and many scientists think that they are junk DNA.

Fraction of non-coding genomic DNA

In bacteria, the coding regions typically take up 88% of the genome. The remaining 12% consists largely of non-coding genes and regulatory sequences, which means that almost all of the bacterial genome has a function. The amount of coding DNA in eukaryotes is usually a much smaller fraction of the genome because eukaryotic genomes contain large amounts of repetitive DNA not found in prokaryotes. The human genome contains somewhere between 1% and 2% coding DNA. (The exact number isn't known because there are disputes over the number of functional coding exons and over the total size of the human genome.) This means that 98-99% of the human genome consists of non-coding DNA and this includes many functional elements such as non-coding genes and regulatory sequences (see below).

Genome size in eukaryotes can vary over a wide range, even between closely related species. This puzzling observation was originally known as the C-value Paradox where "C" refers to the haploid genome size. The paradox was resolved with the discovery that most of the differences were due to the expansion and contraction of repetitive DNA and not the number of genes. Some researchers speculated that this repetitive DNA was mostly junk DNA. The reasons for the changes in genome size are still being worked out and this problem is called the C-value Enigma.

This led to the observation that the number of genes does not seem to correlate with perceived notions of complexity because the number of genes seems to be relatively constant - an issue that's called the G-value Paradox. For example, the genome of the unicellular Polychaos dubium (formerly known as Amoeba dubia) has been reported to contain more than 200 times the amount of DNA in humans (i.e. more than 600 billion pairs of bases vs a bit more than 3 billion in humans). The pufferfish Takifugu rubripes genome is only about one eighth the size of the human genome, yet seems to have a comparable number of genes. Genes take up about 30% of the pufferfish genome and the coding DNA is about 10%. (Non-coding DNA = 90%.) The reduced size of the pufferfish genome is due to a reduction in the length of introns and less repetitive DNA.
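The genome-size comparisons above can be put in absolute terms with a few lines of arithmetic. This is a rough sketch only: the human genome is rounded down to 3 billion base pairs, and the human coding fraction uses the 1% to 2% range quoted earlier.

```python
# Rough genome-size arithmetic for the comparisons above (integer base pairs).
HUMAN_GENOME_BP = 3_000_000_000  # rounded; the text says "a bit more than 3 billion"

# Polychaos dubium: reported at more than 200 times the human genome.
polychaos_bp = 200 * HUMAN_GENOME_BP

# Takifugu rubripes: about one eighth of the human genome, 10% of it coding.
pufferfish_bp = HUMAN_GENOME_BP // 8
pufferfish_coding_bp = pufferfish_bp * 10 // 100

# Human coding DNA: between 1% and 2% of the genome.
human_coding_low_bp = HUMAN_GENOME_BP * 1 // 100
human_coding_high_bp = HUMAN_GENOME_BP * 2 // 100

print(f"Polychaos dubium: {polychaos_bp:,} bp")
print(f"pufferfish genome: {pufferfish_bp:,} bp")
print(f"pufferfish coding DNA: {pufferfish_coding_bp:,} bp")
print(f"human coding DNA: {human_coding_low_bp:,} to {human_coding_high_bp:,} bp")
```

Under these rounded inputs the pufferfish's coding DNA falls inside the human range, which fits the statement that the two genomes have a comparable number of genes despite an eightfold difference in total size.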

Utricularia gibba, a bladderwort plant, has a very small nuclear genome (100.2 Mb) compared to most plants. It likely evolved from an ancestral genome that was 1,500 Mb in size. The bladderwort genome has roughly the same number of genes as other plants but the total amount of coding DNA comes to about 30% of the genome. (Neither paper gives a precise number but it can be estimated from the number of genes and the average size of a coding region.)

The remainder of the genome (70% non-coding DNA) consists of promoters and regulatory sequences that are shorter than those in other plant species. The genes contain introns but there are fewer of them and they are smaller than the introns in other plant genomes. There are noncoding genes, including many copies of ribosomal RNA genes. The genome also contains telomere sequences and centromeres as expected. Much of the repetitive DNA seen in other eukaryotes has been deleted from the bladderwort genome since that lineage split from those of other plants. About 59% of the bladderwort genome consists of transposon-related sequences but since the genome is so much smaller than other genomes, this represents a considerable reduction in the amount of this DNA. The authors of the original 2013 article note that claims of additional functional elements in the non-coding DNA of animals ('dark matter') don't seem to apply to plant genomes.

According to a New York Times piece, during the evolution of this species, "... genetic junk that didn’t serve a purpose was expunged, and the necessary stuff was kept." That's because the bladderwort genome consists mostly of functional genes and their regulatory systems whereas the human genome is more than 90% junk DNA. One of the leading investigators on the study, Victor Albert of the University of Buffalo, puts it like this:

"The big story is that only 3 percent of the bladderwort's genetic material is so-called 'junk' DNA," Albert said. "Somehow, this plant has purged most of what makes up plant genomes. What that says is that you can have a perfectly good multicellular plant with lots of different cells, organs, tissue types and flowers, and you can do it without the junk. Junk is not needed."

Types of non-coding DNA sequences

Noncoding genes

There are two types of genes: protein coding genes and noncoding genes. Noncoding genes are an important part of non-coding DNA and they include genes for transfer RNA and ribosomal RNA. These genes were discovered in the 1960s. Prokaryotic genomes contain genes for a number of other noncoding RNAs but noncoding RNA genes are much more common in eukaryotes.

Typical classes of noncoding genes in eukaryotes include genes for small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), microRNAs (miRNAs), short interfering RNAs (siRNAs), PIWI-interacting RNAs (piRNAs), and long noncoding RNAs (lncRNAs). In addition, there are a number of unique RNA genes that produce catalytic RNAs.

Noncoding genes account for only a few percent of prokaryotic genomes but they can represent a vastly higher fraction in eukaryotic genomes. In humans, the noncoding genes take up at least 6% of the genome, largely because there are hundreds of copies of ribosomal RNA genes. Protein-coding genes occupy about 38% of the genome, a fraction that is much higher than the coding region because genes contain large introns.

The total number of noncoding genes in the human genome is controversial. Some scientists think that there are only about 5,000 noncoding genes while others believe that there may be more than 100,000 (see the article on Non-coding RNA). The difference is largely due to debate over the number of lncRNA genes.

Promoters and regulatory elements

Promoters are DNA segments near the 5' end of the gene where transcription begins. They are the sites where RNA polymerase binds to initiate RNA synthesis. Every gene has a noncoding promoter.

Regulatory elements are sites that control the transcription of a nearby gene. They are almost always sequences where transcription factors bind to DNA and these transcription factors can either activate transcription (activators) or repress transcription (repressors). Regulatory elements were discovered in the 1960s and their general characteristics were worked out in the 1970s by studying specific transcription factors in bacteria and bacteriophage.

Promoters and regulatory sequences represent an abundant class of noncoding DNA but they mostly consist of a collection of relatively short sequences so they don't take up a very large fraction of the genome. The exact amount of regulatory DNA in mammalian genomes is unclear because it is difficult to distinguish between spurious transcription factor binding sites and those that are functional. The binding characteristics of typical DNA-binding proteins were characterized in the 1970s and the biochemical properties of transcription factors predict that in cells with large genomes the majority of binding sites will be fortuitous and not biologically functional.

Many regulatory sequences occur near promoters, usually upstream of the transcription start site of the gene. Some occur within a gene and a few are located downstream of the transcription termination site. In eukaryotes, there are some regulatory sequences that are located at a considerable distance from the promoter region. These distant regulatory sequences are often called enhancers but there is no rigorous definition of enhancer that distinguishes it from other transcription factor binding sites.

Introns

Illustration of an unspliced pre-mRNA precursor, with five introns and six exons (top). After the introns have been removed via splicing, the mature mRNA sequence is ready for translation (bottom).

Introns are the parts of a gene that are transcribed into the precursor RNA sequence, but ultimately removed by RNA splicing during the processing to mature RNA. Introns are found in both types of genes: protein-coding genes and noncoding genes. They are present in prokaryotes but they are much more common in eukaryotic genomes.

Group I and group II introns take up only a small percentage of the genome when they are present. Spliceosomal introns (see Figure) are only found in eukaryotes and they can represent a substantial proportion of the genome. In humans, for example, introns in protein-coding genes cover 37% of the genome. Combining that with about 1% coding sequences means that protein-coding genes occupy about 39% of the human genome. The calculations for noncoding genes are more complicated because there's considerable dispute over the total number of noncoding genes but taking only the well-defined examples means that noncoding genes occupy at least 6% of the genome.

Thus, genes take up 45% of the human genome and most of this is noncoding DNA in introns.
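The percentages quoted in this section can be tallied directly. The sketch below uses only the rounded figures given in the text, so the sums land within about a percentage point of the quoted totals:

```python
# Tally of the human genome fractions quoted above (percent of the genome).
intron_pct = 37          # introns in protein-coding genes
coding_pct = 1           # coding sequences (approximate)
noncoding_gene_pct = 6   # lower bound for noncoding genes

protein_coding_gene_pct = intron_pct + coding_pct
all_genes_pct = protein_coding_gene_pct + noncoding_gene_pct

# Share of all gene sequence that is intron DNA from protein-coding genes.
intron_share_of_genes = intron_pct / all_genes_pct

print(f"protein-coding genes: {protein_coding_gene_pct}%")
print(f"all genes (lower bound): {all_genes_pct}%")
print(f"introns as a share of gene sequence: {intron_share_of_genes:.0%}")
```

Because the noncoding-gene figure is a lower bound, the total for all genes is a lower bound as well.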

There are good reasons to believe that most of the intron DNA is junk DNA (see the discussion in the separate Wikipedia article on introns).

Untranslated regions

The standard biochemistry and molecular biology textbooks describe non-coding nucleotides in mRNA located between the 5' end of the gene and the translation initiation codon. These regions are called 5'-untranslated regions or 5'-UTRs. Similar regions called 3'-untranslated regions (3'-UTRs) are found at the end of the gene. The 5'-UTRs and 3'-UTRs are very short in bacteria but they can be several hundred nucleotides in length in eukaryotes. They contain short elements that control the initiation of translation (5'-UTRs) and transcription termination (3'-UTRs) as well as regulatory elements that may control mRNA stability, processing, and targeting to different regions of the cell.

Origins of replication

DNA synthesis begins at specific sites called origins of replication. These are regions of the genome where the DNA replication machinery is assembled and the DNA is unwound to begin DNA synthesis. In most cases, replication proceeds in both directions from the replication origin.

The main features of replication origins are sequences where specific initiation proteins are bound. A typical replication origin covers about 100-200 base pairs of DNA. Prokaryotes have one origin of replication per chromosome or plasmid but there are usually multiple origins in eukaryotic chromosomes. The human genome contains about 100,000 origins of replication representing about 0.3% of the genome.

Centromeres

Centromeres are the sites where spindle fibers attach to newly replicated chromosomes in order to segregate them into daughter cells when the cell divides. Each eukaryotic chromosome has a single functional centromere that's seen as a constricted region in a condensed metaphase chromosome. Centromeric DNA consists of a number of repetitive DNA sequences that often take up a significant fraction of the genome because each centromere can be millions of base pairs in length. In humans, for example, the sequences of all 24 centromeres have been determined and they account for about 6% of the genome. However, it's unlikely that all of this noncoding DNA is essential since there is considerable variation in the total amount of centromeric DNA in different individuals. Centromeres are another example of functional noncoding DNA sequences that have been known for almost half a century and it's likely that they are more abundant than coding DNA.

Telomeres

Telomeres are regions of repetitive DNA at the end of a chromosome, which provide protection from chromosomal deterioration during DNA replication. Recent studies have shown that telomeres function to aid in their own stability. Telomeric repeat-containing RNA (TERRA) are transcripts derived from telomeres. TERRA has been shown to maintain telomerase activity and lengthen the ends of chromosomes.

Scaffold attachment regions

Both prokaryotic and eukaryotic genomes are organized into large loops of protein-bound DNA. In eukaryotes, the bases of the loops are called scaffold attachment regions (SARs) and they consist of stretches of DNA that bind an RNA/protein complex to stabilize the loop. There are about 100,000 loops in the human genome and each SAR consists of about 100 bp of DNA. The total amount of DNA devoted to SARs accounts for about 0.3% of the human genome.
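The 0.3% figures quoted for replication origins and for scaffold attachment regions both follow from the same arithmetic: number of sites times site length, divided by genome size. A minimal sketch, assuming a human genome of about 3.1 billion base pairs (the text only says "a bit more than 3 billion"); the function name is illustrative:

```python
# Fraction of the human genome covered by many short, fixed-size sites.
HUMAN_GENOME_BP = 3_100_000_000  # assumption: "a bit more than 3 billion" bp


def genome_percent(site_count, bp_per_site, genome_bp=HUMAN_GENOME_BP):
    """Percentage of the genome covered by site_count sites of bp_per_site each."""
    return site_count * bp_per_site / genome_bp * 100


# Origins of replication: about 100,000 sites of 100-200 bp each.
origins_low = genome_percent(100_000, 100)
origins_high = genome_percent(100_000, 200)

# Scaffold attachment regions: about 100,000 sites of about 100 bp each.
sars = genome_percent(100_000, 100)

print(f"origins of replication: {origins_low:.2f}% to {origins_high:.2f}%")
print(f"scaffold attachment regions: {sars:.2f}%")
```

With these inputs the quoted 0.3% corresponds to the lower end of the 100-200 bp range for origins.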

Pseudogenes

Pseudogenes are mostly former genes that have become non-functional due to mutation but the term also refers to inactive DNA sequences that are derived from RNAs produced by functional genes (processed pseudogenes). Pseudogenes are only a small fraction of noncoding DNA in prokaryotic genomes because they are eliminated by negative selection. In some eukaryotes, however, pseudogenes can accumulate because selection isn't powerful enough to eliminate them (see Nearly neutral theory of molecular evolution).

The human genome contains about 15,000 pseudogenes derived from protein-coding genes and an unknown number derived from noncoding genes. They may cover a substantial fraction of the genome (~5%) since many of them contain former intron sequences.

Pseudogenes are junk DNA by definition and they evolve at the neutral rate as expected for junk DNA. Some former pseudogenes have secondarily acquired a function and this leads some scientists to speculate that most pseudogenes are not junk because they have a yet-to-be-discovered function.

Repeat sequences, transposons and viral elements

Mobile genetic elements in the cell (left) and how they can be acquired (right)

Transposons and retrotransposons are mobile genetic elements. Retrotransposon repeated sequences, which include long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs), account for a large proportion of the genomic sequences in many species. Alu sequences, classified as a short interspersed nuclear element, are the most abundant mobile elements in the human genome. Some examples have been found of SINEs exerting transcriptional control of some protein-encoding genes.

Endogenous retrovirus sequences are the product of reverse transcription of retrovirus genomes into the genomes of germ cells. Mutation within these retro-transcribed sequences can inactivate the viral genome.

Over 8% of the human genome is made up of (mostly decayed) endogenous retrovirus sequences, as part of the over 42% fraction that is recognizably derived of retrotransposons, while another 3% can be identified to be the remains of DNA transposons. Much of the remaining half of the genome that is currently without an explained origin is expected to have found its origin in transposable elements that were active so long ago (> 200 million years) that random mutations have rendered them unrecognizable. Genome size variation in at least two kinds of plants is mostly the result of retrotransposon sequences.
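The nested percentages above are easy to misread, since the endogenous retrovirus fraction is already counted inside the retrotransposon fraction. A small sketch of the tally, treating the "over 8%" and "over 42%" figures as if they were exact:

```python
# Tally of transposable-element fractions in the human genome (percent).
retrotransposon_pct = 42   # recognizably derived from retrotransposons
erv_pct = 8                # endogenous retroviruses, included in the 42%
dna_transposon_pct = 3     # remains of DNA transposons

# ERVs are part of the retrotransposon fraction, so they are not added again.
recognizable_pct = retrotransposon_pct + dna_transposon_pct
unexplained_pct = 100 - recognizable_pct
erv_share_of_retro = erv_pct / retrotransposon_pct

print(f"recognizable transposable elements: {recognizable_pct}%")
print(f"without an explained origin: {unexplained_pct}%")
print(f"ERV share of retrotransposon DNA: {erv_share_of_retro:.0%}")
```

The remainder of about 55% is the "remaining half" that the text attributes mostly to ancient, no longer recognizable transposable elements.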

Highly repetitive DNA

Highly repetitive DNA consists of short stretches of DNA that are repeated many times in tandem (one after the other). The repeat segments are usually between 2 bp and 10 bp but longer ones are known. Highly repetitive DNA is rare in prokaryotes but common in eukaryotes, especially those with large genomes. It is sometimes called satellite DNA.

Most of the highly repetitive DNA is found in centromeres and telomeres (see above) and most of it is functional although some might be redundant. The other significant fraction resides in short tandem repeats (STRs; also called microsatellites) consisting of short stretches of a simple repeat such as ATC. There are about 350,000 STRs in the human genome and they are scattered throughout the genome with an average length of about 25 repeats.

Variations in the number of STR repeats can cause genetic diseases when they lie within a gene but most of these regions appear to be non-functional junk DNA where the number of repeats can vary considerably from individual to individual. This is why these length differences are used extensively in DNA fingerprinting.

Junk DNA

"Junk DNA" refers broadly to "any DNA sequence that does not play a functional role in development, physiology, or some other organism-level capacity." The term "junk DNA" was used in the 1960s. but it only became widely known in 1972 in a paper by Susumu Ohno. Ohno noted that the mutational load from deleterious mutations placed an upper limit on the number of functional loci that could be expected given a typical mutation rate. He hypothesized that mammalian genomes could not have more than 30,000 loci under selection before the "cost" from the mutational load would cause an inescapable decline in fitness, and eventually extinction. The presence of junk DNA also explained the observation that even closely related species can have widely (orders-of-magnitude) different genome sizes (C-value paradox).

Since the late 1970s it has become apparent that most of the DNA in large genomes finds its origin in the selfish amplification of transposable elements, of which W. Ford Doolittle and Carmen Sapienza in 1980 wrote in the journal Nature: "When a given DNA, or class of DNAs, of unproven phenotypic function can be shown to have evolved a strategy (such as transposition) which ensures its genomic survival, then no other explanation for its existence is necessary." The amount of junk DNA can be expected to depend on the rate of amplification of these elements and the rate at which non-functional DNA is lost. Another source is genome duplication followed by a loss of function due to redundancy. In the same issue of Nature, Leslie Orgel and Francis Crick wrote that junk DNA has "little specificity and conveys little or no selective advantage to the organism".

The term "junk DNA" may provoke a strong reaction and some have recommended using more neutral terminology such as "nonfunctional DNA." Junk DNA is often confused with non-coding DNA but, as documented above, there are substantial fractions of non-coding DNA that have well-defined functions such as regulation, non-coding genes, origins of replication, telomeres, centromeres, and chromatin organizing sites (SARs).

ENCODE Project

The Encyclopedia of DNA Elements (ENCODE) project uncovered, by direct biochemical approaches, that at least 80% of human genomic DNA has biochemical activity such as "transcription, transcription factor association, chromatin structure, and histone modification". Though this was not necessarily unexpected due to previous decades of research discovering many functional non-coding regions, some scientists criticized the conclusion for conflating biochemical activity with biological function. Estimates for the biologically functional fraction of the human genome based on comparative genomics range between 8 and 15%. However, others have argued against relying solely on estimates from comparative genomics due to its limited scope since non-coding DNA has been found to be involved in epigenetic activity and complex networks of genetic interactions and is explored in evolutionary developmental biology. One consistent indication of biological functionality of a genomic region is if the sequence of that genomic region was maintained by purifying selection (or if mutating away the sequence is deleterious to the organism). Under this definition, 90% of the genome is 'junk'. However, some stress that 'junk' is not 'garbage' and the large body of nonfunctional transcripts produced by 'junk DNA' can evolve functional elements de novo.

The meaning of the results has been disputed by other scientists, who argue that neither accessibility of segments of the genome to transcription factors nor their transcription guarantees that those segments have biochemical function and that their transcription is selectively advantageous. After all, non-functional sections of the genome can be transcribed, given that transcription factors typically bind to short sequences that are found (randomly) all over the whole genome.

Furthermore, the much lower estimates of functionality prior to ENCODE were based on genomic conservation estimates across mammalian lineages. Widespread transcription and splicing in the human genome has been discussed as another indicator of genetic function in addition to genomic conservation, which may miss poorly conserved functional sequences. Furthermore, much of the apparent junk DNA is involved in epigenetic regulation and appears to be necessary for the development of complex organisms. Each approach has limits: genetic approaches may miss functional elements that do not manifest physically on the organism; evolutionary approaches have difficulties using accurate multispecies sequence alignments since genomes of even closely related species vary considerably; and biochemical approaches, though highly reproducible, yield signatures that do not always signify a function. Kellis et al. noted that 70% of the transcription coverage was less than 1 transcript per cell (and may thus be based on spurious background transcription). On the other hand, they argued that a 12–15% fraction of human DNA may be under functional constraint, and that this may still be an underestimate when lineage-specific constraints are included. Ultimately genetic, evolutionary, and biochemical approaches can all be used in a complementary way to identify regions that may be functional in human biology and disease. Some critics have argued that functionality can only be assessed in reference to an appropriate null hypothesis. In this case, the null hypothesis would be that these parts of the genome are non-functional and have properties, be it on the basis of conservation or biochemical activity, that would be expected of such regions based on our general understanding of molecular evolution and biochemistry. According to these critics, until a region in question has been shown to have additional features, beyond what is expected of the null hypothesis, it should provisionally be labelled as non-functional.

Genome-wide association studies (GWAS) and non-coding DNA

Genome-wide association studies (GWAS) identify linkages between alleles and observable traits such as phenotypes and diseases. Most of the associations are between single-nucleotide polymorphisms (SNPs) and the trait being examined and most of these SNPs are located in non-functional DNA. The association establishes a linkage that helps map the DNA region responsible for the trait but it doesn't necessarily identify the mutations causing the disease or phenotypic difference.

SNPs that are tightly linked to traits are the ones most likely to identify a causal mutation. (The association is referred to as tight linkage disequilibrium.) About 12% of these polymorphisms are found in coding regions; about 40% are located in introns; and most of the rest are found in intergenic regions, including regulatory sequences.
