
LECTURE 1. Classical and molecular genetics. Basic concepts: trait, phenotype, genotype, gene, locus, allele, homozygote, heterozygote, hemizygote.

ICG SB RAS and FEN NSU, Novosibirsk, 2012

1.1. Classical and molecular genetics

Today's lecture is introductory; we will move on to specifics later. As with almost any science, it is rather difficult to delineate the boundaries of genetics, and the very general definition "genetics is the science of heredity" is not particularly fruitful. Zhimulev, for example, once said that genetics is now present everywhere - in medicine, forensics, the theory of evolution, archaeology - while in genetics itself nucleic acids are almost out of sight and everything is protein interactions. Thus he effectively equated genetics with all of modern biology. On the other hand, for roughly the first two thirds of the 20th century genetics was perhaps the most self-contained and clearly delimited area of biology, distinguished primarily by its synthetic methodology, in contrast to the analytical methodology of other branches of biology. To learn about the structure of its object, it did not take the object apart but judged the parts indirectly, by observing the whole (namely, the behavior of traits in crosses) and relying on mathematics, and it confirmed its conclusions by obtaining living organisms with predicted properties. Thus, from its very beginning genetics was able to create something new, not just describe what was observed. Meanwhile, in the second half of the 20th century molecular biology developed rapidly - at first a purely analytical science that took things apart. Its progress, however, was achieved largely by genetic methods: recall, for example, that the genetic code was established in the experiments of Benzer and Crick using mutations in bacteriophages. In that case, though, it was the genetics of microorganisms that was used, whereas the progress of "classical" genetics has always been associated with the genetics of eukaryotes.


As a result, molecular biology has obtained almost exhaustive knowledge of what a living organism consists of and how it is built. The subjects of molecular biology and genetics overlapped in many respects: both studied the transmission and realization of hereditary information (and a living organism is the realization of hereditary information), but they approached this subject from opposite sides - genetics "from the outside", molecular biology "from the inside".

In the last third of the twentieth century, molecular biology and genetics met, so to speak, including in the study of eukaryotes. The speculative objects of genetics turned into quite concrete physical and chemical objects of known structure, and molecular biology became a synthetic science, able to alter even higher multicellular organisms at will - for example, by genetic modification. At this point the boundaries of genetics as a science were blurred beyond recognition: it became impossible to say where molecular biology ends and genetics begins. Moreover, the term “molecular genetics” appeared to designate the resulting synthetic science, which made it unclear what exactly was left of genetics outside it. The genetics of the pre-molecular period, with all its approaches based on crosses and probability theory, was given the honorary title of "classical genetics". With this title, however, it was in effect sent into honorable retirement. One may recall how Watson and Crick declined to discuss their model of DNA structure in their Nature paper because the implications were too large and obvious. At some point it might have seemed that all of genetics follows from this model.

A paradoxical situation arises. All courses in genetics begin with the history of the science. They examine how Mendel worked with peas, what he obtained and how he interpreted it given what he knew, and then how Morgan and his school worked with Drosophila, what they obtained and how they interpreted it. Neither topic can be omitted: Mendel is an example of a person who developed from scratch, and brilliantly applied, a genetic methodology based on mathematics, and the Morgan school developed the chromosome theory of heredity and, in effect, all of classical genetics in the first three decades of the twentieth century. Beyond that, courses in genetics fall into two broad classes. Some work through the entire history and internal logic of the development of the science in detail, demonstrating both the power of its methodology and the capacity of the human mind to penetrate speculatively into the depths of things. Others skim through this historical period, move on to molecular genetics and consider what is currently known about the structure and functioning of genes. In fact, both types of course place classical genetics in the past and differ only in how detailed the retrospective is. It turns out that classical genetics has, as it were, only historical significance. Yet its powerful methodology has not gone away and is necessary for a very wide range of studies. If we look at papers with thoroughly molecular-biological titles published in the best journals, we will see that they all rest on extensive material on hundreds of individual mutations and their combinations, and on the relationship between the nature of the mutations and the phenotype they cause.
This is true both for Drosophila or mice, for which huge genetic collections have been assembled and special laboratory lines created (some about a hundred years ago, others recently), and for humans, for whom a huge amount of medical-genetic (in fact, population-genetic) data on hereditary diseases has been accumulated. The richer this arsenal of knowledge and model organisms, the more elegant the work. All these highly serious studies are impossible without mastery of the methodology of both classical and molecular genetics. It is therefore best to study these “two genetics” in parallel, however difficult that is to organize.


In modern science one can also see how neglect of "outdated" classical genetics leads to oddities. For example, a group of European scientists needed to obtain a pea heterozygous for a translocation. (I am speaking on the assumption that you have some idea of what this means. If you do not, it does not matter: we will cover all of this in almost excessive detail; for now the point is the need for genetic knowledge.) They obtained it by fusing protoplasts of the parental lines. Regeneration from cell culture in peas is extremely difficult, so this is an extremely laborious route. Why did they do it? Apparently they thought that translocation carriers could not be crossed with ordinary peas! In fact, problems with reproduction in crosses between parents that differ by a translocation do arise, but only in the next generation, and they amount to the loss of only half of the fertility.

But these scientists at least needed a heterozygote. Meanwhile, the general fascination with molecular biology and neglect of classical genetics mean that the very existence of heterozygotes - that is, the fact that in eukaryotes each gene is present in two copies, which may differ or be identical - is often completely forgotten. For example, I was sent for review an article by German authors in which they directly sequenced a certain non-coding DNA region from 38 dragonfly individuals caught in different regions (Western Europe, Western Siberia, Japan and North America) and found 20 variants of it. The article was written as if only one variant had been found in each individual. However, if the variability is really as high as they claim, then both copies are unlikely to be the same in any given individual, and the probability that this holds for every individual in their sample hardly differs from zero. And this was not even discussed. After the review, they wrote that heterozygosity was suspected in five cases. If there really were only five, then they had in their hands an astonishing phenomenon, the conversion of heterozygotes into homozygotes by mechanisms still unknown, but they did not even seem to realize this.
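A rough sanity check of this argument can be sketched in Python. The sketch assumes, purely for illustration, that the 20 variants are equally frequent and that genotypes follow Hardy-Weinberg proportions. Neither assumption comes from the reviewed article, and the sample was in fact pooled from distant regions. Equal frequencies give the highest possible expected heterozygosity, so the figures are an upper bound rather than an estimate.

```python
from fractions import Fraction


def expected_heterozygosity(freqs):
    """Hardy-Weinberg expected share of heterozygotes: 1 - sum(p_i^2)."""
    return 1 - sum(p * p for p in freqs)


n_individuals = 38  # sample size mentioned in the text
n_variants = 20     # number of sequence variants mentioned in the text

# Assumption (not from the article): all variants are equally frequent.
freqs = [Fraction(1, n_variants)] * n_variants

h = expected_heterozygosity(freqs)
expected_heterozygotes = float(h * n_individuals)
p_all_homozygous = float((1 - h) ** n_individuals)

print(f"expected heterozygosity: {float(h):.2f}")
print(f"expected heterozygotes among {n_individuals}: {expected_heterozygotes:.1f}")
print(f"P(every individual is homozygous): {p_all_homozygous:.1e}")
```

Under these assumptions almost all individuals would be expected to be heterozygous, which is the point of the objection: five suspected heterozygotes is far fewer than such variability would suggest.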

Phylogeny reconstructions based on particular DNA sequences are now widespread. Quite often, attempts are made to judge from the divergence time between populations whether they belong to the same biological species or to different ones. (Note that it is the divergence time that is estimated, since the genes studied, whose variability accumulates at a more or less constant rate, are evidently not the genes whose changes could be associated with speciation.) Yet divergence time generally has little to do with this question: the acquisition of reproductive isolation by a local population, that is, speciation, happens under particular conditions and usually takes little time on the paleontological scale (tens to hundreds of thousands of years), whereas populations can diverge for a long time without speciating. The real question is whether there is reproductive isolation (at least potential) between the populations. To answer it, one should check whether genes are exchanged between them (where this is physically possible). Here it is very important to find out whether heterozygotes for the alleles characteristic of each population occur where the populations meet, and at what frequency. But almost no one does this, and whether populations belong to one species or to different ones is judged by the level of differences between them, compared with the differences in cases assumed to be beyond doubt.

In general, while a single organism (for example, as a representative of its species) can be studied by the methods of molecular genetics, as soon as one deals with many organisms, that is, as soon as a population-genetic problem arises - and such problems arise quite often, for example in population biology and breeding - one cannot do without the approaches of classical genetics. Classical genetics is indispensable in everything that concerns individual differences and the characteristics of many individuals of the same species. This is precisely its element, and it is precisely here that those present-day scientists who have replaced a classical genetic education with a molecular-biological one often find themselves helpless.

Based on the foregoing, I see my task as presenting classical genetics not so much historically, following the great scientists of the past, as from the current state of the science, and in particular without setting aside the knowledge you have already gained in your molecular biology and cytology courses. Some patterns that were discovered as purely empirical at the level of organisms receive a completely natural interpretation at the molecular level and look almost trivial. Nevertheless, these patterns themselves must be clearly understood, since they are to be used at the level of organisms. In a sense, such a course in genetics is conceived as something like a "demonstration of tricks with their exposure", where both the trick itself and what lies behind it are equally established facts. Such a course would be designed to teach a very productive methodology: to descend from a trait to genes and, through understanding the mechanism of their action, to climb back up to the synthesis of new traits.

As you have already understood, the content of genetics today is huge and heterogeneous, so the time allotted to us is hardly enough even for a brief introduction. This forces us to leave aside the history of genetics as a topic in its own right, which would deserve a special course.

Unfortunately, none of the existing textbooks corresponds to the ideal outlined above of studying genetics at its present stage - from trait to gene and back - most likely because the science is now developing too quickly. As some compensation, I will try to post my modest lectures on my own website, where they will be available to those to whom I give the address, that is, to you. As a basis I would recommend the textbook by Inge-Vechtomov, "Genetics with the Basics of Selection". The textbook by Academician Igor Fedorovich Zhimulev, "General and Molecular Genetics", is also well known; its main emphasis is on molecular genetics, and Leonid Vladimirovich recommends it as the basic textbook. I understand that having two basic textbooks is not the most convenient situation for passing the exam, but it does help in understanding the subject. I can say that I personally am here, and work at the Institute of Cytology and Genetics at all, solely because I took the genetics course taught by Vladimir Aleksandrovich Berdnikov. It was the best course I had ever heard, and it did not correspond to any textbook at all, because V.A. prepared it from the latest reviews in scientific periodicals, which had not yet made their way into any textbook. Igor Fedorovich, too, turned his original lecture course into a textbook.

We will go through the basics of genetics very thoroughly in order to get a good feel for them. We will start from the very beginning, even though the most elementary foundations of genetics are covered at school, so that, God forbid, we do not miss something simple but important. On the other hand, I am dealing with students who have already taken a course in molecular biology and are currently studying probability theory and mathematical statistics, which allows me not to be distracted too much by the material of these courses, so necessary for studying genetics. For example, I will assume that you know (or will know at the right time) what alternative splicing or the Poisson distribution is.

The standard logic of presenting biology in university courses is to move from the bottom up: from atoms to molecules and macromolecules, then to the structures of the cell, to the life of the cell itself, and then to the multicellular organism. Now that we know the principles of the organization of life through and through, this order of presentation turns out to be organic and natural. These principles include the mechanism by which nucleic acids function as a carrier of information, primarily about the variety of proteins and functional RNAs (which, after the discovery of small RNAs, turned out to be more diverse than previously thought) - not only about their structure, but also about when, where and in what quantity particular RNAs or proteins should be synthesized. These processes are in turn controlled by certain proteins (and often RNAs). Genetic control systems unfold as a cascade: genes code for proteins (RNAs) needed to control genes that code for other proteins (RNAs), and so on. Since almost everything in the body is “made” by proteins (plus some RNAs), it turns out that information about the whole organism is in fact recorded in nucleic acids; however, this information cannot be read without previously synthesized proteins (again made from a DNA template) that operate on DNA.

This order of presentation completely coincides with the order in which life itself developed. At first, these were some kind of “simple” (but only in comparison with what later emerged from them) systems of self-reproducing macromolecules, apparently, nucleic acids. Then they happened to surround themselves with a phospholipid membrane, which allowed them to build their own microcosm within it. This is how cells were born. Proteins played an increasingly important role in the functioning of these first living beings, but nucleic acids retained full control. Cells became more complex and learned to divide more and more correctly. After division, they sometimes did not disperse, forming colonies. These colonies faced increasingly complex problems due to their size and shape - all the cells in the colony had to be supplied with everything necessary for life. The resolution of these problems was achieved through a certain structure of the colonies and the division of labor between their constituent cells. Simple colonies have turned into states of cells, that is, into multicellular organisms. The problems of their self-reproduction as complex structures were also solved, and this was realized in such a way that each organism could develop from one cell by deploying a complex genetic program that regulates cell division and interaction between them.

However, this standard order of presenting biological knowledge is detached from how that knowledge was obtained. It was obtained as science moved in the opposite direction: from organisms to organs, cells, macromolecules and atoms. As they descended to each of these levels, scientists could only guess how the next, deeper level worked. Once, the most they could do was open up the body, look at the organs and guess how they functioned. When cells were discovered, they were at first thought to be empty. Then protoplasm was discovered, but at first it was seen only as a viscous liquid that nevertheless, in some mysterious way, contained the essence of life. The nucleus and the organelles of the cell were discovered. Dyes were found that stain them differently, which brought researchers closer to their chemical composition. At the end of the nineteenth century nucleic acids were discovered and their approximate chemical composition worked out, but their specific structure long remained a mystery, whose solution looked so brilliant. At this point the descent into the depths of biology perhaps came to a halt, and a period of accumulating particulars at this deep molecular level began. There turned out to be an unusually large number of particulars. We are now living through a period when this huge number of details is beginning to be assembled into a coherent picture, a model of the structure of a living organism. Moreover, this model is so complex that the human mind cannot take it in fully, so that not only its construction but also its visual description and use are impossible without modern computers. Still, by the end of the twentieth century all the basic principles of biology had been discovered. Classical genetics, through the efforts of a few talented scientists, developed almost in its entirety during the first three decades of the 20th century as a coherent and logical science.

Classical genetics is a vivid example of the researcher moving from the macro level to the micro level. It reconstructs the design of a system from its behavior, approaching it as a black box, as if the mechanisms of some unknown alien device had fallen into scientists' hands without any diagrams or instructions. Two main features can be noted. The first is the amazing depth of reconstruction that it achieved while lacking direct information about the structure of its object. The power of the classical genetic approach is impressive: dealing only with visible traits, it made it possible to form an idea of genes accessible only to the intellect, of their arrangement in some mysterious linear carriers, and of changes in genes and in these carriers. From the patterns of inheritance of traits it derived ideas about the structure of the carriers of genetic information, the transfer of this information to descendants and its transformation into living flesh. The second feature is the already mentioned synthetic rather than analytical nature of genetic knowledge, whose validity was embodied, in the very process of obtaining it, in the creation of something new - organisms with new traits. It is enough to have well-studied genetics for a few model objects; the remaining objects can then be judged by their similarity to them. The well-known aphorism of Thomas Morgan, “what is true of the fly is true of the elephant”, is of course a rather strong exaggeration, and we will see this. Nevertheless, this approach (which also finds expression in the so-called law of homologous series) still works.

Crossing is the main method of classical genetics. Geneticists reached most of their conclusions by observing the behavior of traits in parents and offspring, and the researcher's actions with each new generation are determined by the results obtained in the previous one. Genetic research is therefore a bit like a game of chess. The conclusions drawn from such studies were extremely detailed and, as the further development of science showed, correct. In his experiments on peas in the second half of the 19th century, Gregor Mendel in effect postulated the existence of chromosomes and described their behavior in meiosis without having the slightest idea of chromosomes. The connection between genes and chromosomes was established only at the beginning of the 20th century, and almost until the middle of that century proteins were strongly suspected of being the material carrier of heredity. In other words, while other branches of biology had not moved far from the descriptive approach, genetics in its models was far ahead of the time when the objects it studied could be described as material entities. In the tragic period of Russian science, which fell under ideological dictate in the 1930s-1950s, this served as a pretext for declaring genetics an idealistic pseudoscience, throwing our country, which had been at the forefront of genetics, far back, and physically destroying its best geneticists.

This cognitive power of classical genetics, a science able to draw correct conclusions about the behavior of certain cellular microstructures from the behavior of traits in crosses without even knowing what they consist of, is due primarily to the fact that genetics incorporates a great deal of mathematics from various branches of that discipline. This in turn stems from the fact that the object of genetics is not a particular biological structure but information. Information can be studied regardless of the material medium on which it is realized. A programmer, for instance, does not need to know exactly how the program will be embodied in the state of the crystals in a computer processor, although he is aware that it will be implemented on precisely this physical basis. Genetics is essentially biological informatics. Computer science used to be called cybernetics, and it was another "pseudoscience" persecuted under Stalin and Khrushchev, for all the difference between them. (Fortunately, at that time it was not as developed a branch of mathematics as genetics was a branch of biology, and so this campaign did less damage.)

"Classical" genetics (sometimes called Mendelian, although what is meant is much broader than what Mendel discovered, and the notorious ideological label “Mendelism-Morganism” would be more fitting) can be defined as the science of heredity that operates with the abstract elements of the system controlling the organism's development, abstracting from their material carrier and, in fact, having no need of it. Correspondingly, molecular genetics can be defined as the science of the molecular mechanisms underlying heredity. I hope it is unnecessary to urge you not to attach great importance to these and similar formal definitions. In real scientific practice there are no "two genetics", still less a border between them, and the above definitions only indicate the general direction of thought...

It is well known, however, that any definition of anything is imperfect, since our thinking is not mathematical logic, and concepts - what our thinking operates with - are not reducible to words, the means by which we record and communicate, with certain losses, the results of thinking. Concepts can only be understood (with varying degrees of clarity) by observing their interactions with previously understood concepts across a body of texts in which concepts are denoted by words. A definition is merely the most concise and effective text that brings one closer to understanding, but there will always be situations where any definition fails (while the concept still works). Where possible, I try to give the definitions that seem to me most successful, without much concern for whether they match previously proposed ones or are original, but I do not take them too seriously and am very far from the idea that taking them down from dictation and memorizing them can make the subject easier to understand.

At first, genetics consisted of the solitary feat of a single scientist who was not understood by any of his contemporaries and who, by virtue of personal genius and a versatile education, himself proposed a fruitful methodology, scrupulously carried out lengthy and extensive experiments, and made a non-obvious speculative assumption. Soon after the rediscovery of genetics, that is, its emergence as a science pursued by many, it was found that the factors of heredity are arranged in a strictly defined order and at definite distances from one another in several linear structures, whose number, relative size and behavior coincided with the number, relative size and behavior of chromosomes in meiosis. The chromosome theory of heredity was formulated in 1900-1903 by the American cytologist Walter Sutton and the German embryologist Theodor Boveri and further developed by the famous American geneticist Thomas Morgan and his school: Muller, Sturtevant and Bridges. (From 1906 on they were the first to conduct research on Drosophila; at first they had planned to use rabbits, but the financial manager of their university did not let this plan through. Charles Woodworth was the first to culture Drosophila, and he also suggested that it could become a convenient object for studying heredity.) And this important conclusion that the factors of heredity reside in chromosomes, reached so early, was rejected by official science in the USSR from the late 1940s to the early 1960s!

Comparison of speculative genetic maps (the relative positions of genes in these structures) with various regions of chromosomes made it obvious that the genes are located in the chromosomes. But classical genetics does not really need this: its models, tested by the results of crosses, place genes in a kind of "virtual chromosomes". So to this day, for most objects there are two types of chromosome maps: physical maps, which show exactly where the genes are located on chromosomes visible under the microscope or on the DNA molecule, and genetic, or recombination, maps, which reconstruct the mutual arrangement of genes from the results of crosses. The order of genes in the two types of map coincides completely, but the relative distances between them far from always do, and there are quite exhaustive explanations for this, which will be discussed later.

As a science of information and control, classical genetics even has a structure similar to that of mathematics. It rests entirely on a system of speculative a priori concepts with which observed phenomena are correlated (in contrast, for example, to cytology, whose conceptual apparatus is introduced on the basis of empirical facts visible to the eye). Unfortunately, the terminology corresponding to these concepts (and concepts and terms are not the same thing) has accumulated a certain inconsistency over the lifetime of genetics, and I will dwell on it specifically so that you are not misled by divergent word usage in the genetic literature. Of course, genetic concepts are introduced on the basis of observed facts, but the main ones are introduced rather as speculative mathematical concepts. There are many concepts and corresponding terms in genetics. But they are really needed, and, once introduced, they practically exhaust the subject. In many cases it is enough to match an observed phenomenon with a suitable concept, and everything becomes clear. Perhaps a good explanatory dictionary of genetic terms could serve as a textbook of genetics. Pedagogically, it would be more correct to introduce the conceptual apparatus and terminology as they become necessary. But there is no harm in introducing and discussing the basic concepts at the very beginning and then pointing out the places where they are needed. We will proceed on the assumption that you are already familiar with some concepts, at least from the school course, and will sometimes use them even before discussing them in detail.

1.2. Traits of organisms. Phenotype and genotype.

Perhaps the most important genetic concept is the trait. Genetics as a science began at precisely the moment when Gregor Mendel began to analyze individual traits rather than heredity as a whole. Can you tell me what a trait is? And how many can there be? A trait is anything associated with an individual, as long as there is some way to record it. Height, weight, color, pitch of the call, half the length of the tail added to the square root of a third of the length of the nose, number of hairs in the beard, shape of the burrow or anthill, number of males pursuing one female, length of time one can go without breathing under water, number of lovers that the mother or daughter of the subject has. I am not joking: among carriers of a certain variant of one of the dopamine receptors, the trait "grew up without a father" occurs at a high frequency (clearly, this is more a trait of one of the parents than of the subject in question, who could, however, have inherited the predisposition).

The choice is huge, but the more aptly, wisely or wittily you choose a trait, the more information you will gain from the experiment. Clearly, you should not add the square root of the length of the nose to the length of the tail: both lengths have the same dimension, so after taking the square root of one of them the two terms no longer do, and the result is mathematical gibberish. But adding the cube root of body mass to the length of the tail makes more sense, because mass depends on the cube of linear dimensions; taking the cube root gives a quantity commensurate with the length of the tail, and by adding the two we obtain a certain measure of linear dimensions.

It is easy to see that, out of their infinite variety, not all traits are equally informative. Some are equally informative but add nothing to each other. For example, take two traits: the length of the right leg and the length of the left leg. It is intuitively clear that although the two legs may differ slightly in length, the second adds little to the first. Now take the length of the left leg and height. What can we say about them? The greater the height, the greater the leg length - this is quite obvious. Height and leg length are correlated - no more, but no less. Indeed, if we take a sample of people, measure height and leg length and calculate the correlation coefficient, it will be quite close to one and highly significant. But we know that people can be short-legged or long-legged. And if we take height and the ratio of leg length to height, we get two completely independent traits, linear size and long-leggedness, which can be inherited independently.
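The difference between the two pairs of traits can be illustrated with a small simulation. This is only a sketch on invented numbers: the means and spreads are hypothetical, and the independence of height and the leg-to-height ratio is built into the simulation as an assumption rather than demonstrated by it.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(1)  # fixed seed so that the run is reproducible
n = 2000

# Invented parameters: height in cm and leg length as a share of height,
# drawn independently of each other (assumption of the sketch).
heights = [random.gauss(170, 10) for _ in range(n)]
leg_ratios = [random.gauss(0.47, 0.015) for _ in range(n)]
leg_lengths = [h * r for h, r in zip(heights, leg_ratios)]

r_height_leg = pearson(heights, leg_lengths)
r_height_ratio = pearson(heights, leg_ratios)

print(f"height vs leg length:       r = {r_height_leg:.2f}")
print(f"height vs leg/height ratio: r = {r_height_ratio:.2f}")
```

In such a sample the first coefficient comes out high and the second close to zero: leg length largely repeats the information carried by height, while the ratio adds something new.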

We now have a ratio of two measured values. As a rule, working with many traits immediately requires proper statistical processing, and ratios are not very convenient for such processing. But there is a set of mathematical methods called multivariate statistics (in particular, the principal component method for quantitative traits) that allows us to obtain, from any N traits we have measured, N new traits that are linear combinations of the original ones (their sums with different coefficients) and that do not correlate with one another. This means that each of them carries independent information. And if we look at how these N new traits are composed, we will see that one of them reflects, for example, linear dimensions (it will include all the lengths: of the body, arms, legs, etc.), another thickness, a third unevenness of thickness (pronounced waist, hips and bust), a fourth the relative size of the head, a fifth skin darkness, and so on. Such traits are the most informative, and they contribute differently to the overall variability of the objects, which can also be estimated. However, multivariate methods do not solve the problem of trait duplication, since duplicated traits inflate the relative contribution to overall variability of the new trait into which they fall. This problem has not yet been solved in mathematical statistics.
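As a minimal sketch of the principal component idea, the following Python code (using NumPy) builds four correlated traits from synthetic, invented data and turns them into four uncorrelated new traits. The trait names, coefficients and the two hidden factors are assumptions made for illustration, and the covariance-based variant shown here is only one of several ways to carry out the method.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (invented): two hidden, independent factors.
size = rng.normal(0.0, 1.0, n)   # overall linear dimensions
girth = rng.normal(0.0, 1.0, n)  # overall thickness

# Four measured traits, each a mix of the hidden factors plus noise.
traits = np.column_stack([
    170 + 10 * size + rng.normal(0, 2.0, n),             # height, cm
    80 + 5 * size + rng.normal(0, 1.5, n),               # leg length, cm
    75 + 4 * size + rng.normal(0, 1.5, n),               # arm length, cm
    85 + 2 * size + 8 * girth + rng.normal(0, 2.0, n),   # waist, cm
])

# Principal components are eigenvectors of the covariance matrix.
centered = traits - traits.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # re-sort: largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# New traits: linear combinations of the original (centered) traits.
scores = centered @ eigvecs
score_corr = np.corrcoef(scores, rowvar=False)
explained = eigvals / eigvals.sum()      # share of total variability

print("coefficients of the first new trait:", np.round(eigvecs[:, 0], 2))
print("share of variability per new trait: ", np.round(explained, 3))
print("largest correlation between new traits:",
      np.abs(score_corr - np.eye(4)).max())
```

The columns of `eigvecs` are the coefficients with which the original traits enter each new trait, and `explained` is the contribution of each new trait to the overall variability. Adding a duplicate of, say, the height column would raise the share of the component that contains it, which is the duplication problem mentioned above.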

Traits can be very diverse, but they fall into two large classes: qualitative, or alternative, and quantitative, or continuous. A trait is qualitative when its variability shows up as several alternative variants, that is, when each individual belongs to one of several clear-cut classes and its assignment to a class is beyond doubt. For example, we can distinguish two such classes of human individuals, men and women. Women can in turn be divided into several alternative classes. Suppose a girl is wearing either trousers or a single cylindrical piece of fabric around her legs, a dress or a skirt. That gives two classes. The second case can be split into two, a dress or a skirt, which gives three classes of women. Women themselves can certainly distinguish many alternative classes of clothing and will not have the slightest difficulty in classifying them. Classical examples: pea flowers are white or purple, and fruit fly eyes are likewise white or purple; amusingly, both organs can also be pink, which is yet another state of a qualitative trait, a separate class. When qualitative (alternative) traits can be distinguished and individuals belonging to the different classes (variants) are regularly found in nature, it is customary to speak of polymorphism, and the classes (variants) of such traits are usually called morphs, or forms. These are originally the same word, in Greek and in Latin respectively, but the meaning of the second is too ambiguous, and it is better avoided. Etymologically both words denote shape, but as terms they are used for any trait, for example for color. Below are two morphs of the Altai violet, with yellow and with purple flowers, which occur in nature at approximately equal frequency.

[Figure: two color morphs of the Altai violet, with yellow and with purple flowers]
[Figure: color morphs of an iris]

Since we all went to school, we may suspect that the white and the purple irises are homozygous for certain alleles and the lilac one is heterozygous for them. But we (I in particular) do not have such information yet, and in any case we must start by simply recording the three color morphs.

We have mentioned three clear classes of pea flower color: white, purple and pink. But apple trees with purple petals grow on Zolotodolinskaya Street, and there are also apple trees with pink, slightly pinkish and white petals. With the carnations sold at flower stalls, flower color looks like a qualitative trait: there are red, white and pink ones, and white ones with a red edge. But breeders probably have such a variety of carnations that the trait becomes a quantitative one. You can take a spectrophotometer, extract the anthocyanin pigment from a standard sample of petals and measure the intensity of the purple anthocyanin color, expressing it as a number. We then get a quantitative trait, that is, a trait that can be expressed as a real number. One and the same trait can act as quantitative or qualitative in different situations. For almost any qualitative trait you can find a way to measure it and thus treat it as quantitative. Conversely, most quantitative traits cannot be treated as qualitative, because the values of the measured parameter rarely group into clearly distinguishable classes.

Human height (if we exclude obvious dwarfism) is a typical quantitative trait. How many variants of height does a normal person have? Right, it is impossible to say: height is a positive real number, and the number of "variants" depends on the accuracy of measurement and on the physical limits of the quantity. The height of a large group of people can be characterized by its mean. But we also need some measure of its variability, and for that we have to study the frequency distribution of the quantitative trait. Another textbook example: if you take many people, measure their height to the nearest centimeter and line them up so that people of the same height stand in one column, you get the following picture: the lengths of the columns trace out a bell-shaped curve. With fine enough measurement steps and enough people, this curve closely reproduces a distribution well known in probability theory, the normal, or Gaussian, distribution.

Variance (dispersion) is the mean squared deviation of individual values from the mean. Its square root is the standard deviation, which has the same dimension as the measured quantity and can serve as a measure of the spread of the trait. About 68% of all normally distributed objects, however many we measure, lie within one standard deviation of the mean, that is, between the mean minus the standard deviation and the mean plus the standard deviation. If this interval around the mean is doubled, it will contain about 95% of the objects, and if tripled, about 99.7%.
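As a numerical check of these definitions, the following Python sketch computes the variance and standard deviation of a tiny made-up sample of heights and then the exact fraction of a normal distribution lying within one, two and three standard deviations of the mean. The sample values are invented for illustration; the fractions follow from the error function of the standard normal distribution.

```python
import math

def variance(values):
    """Mean squared deviation of the values from their mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def std_dev(values):
    """Square root of the variance, in the units of the measurements."""
    return math.sqrt(variance(values))

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

heights_cm = [160.0, 170.0, 180.0]  # made-up illustrative sample

print(f"variance = {variance(heights_cm):.2f} cm^2")
print(f"standard deviation = {std_dev(heights_cm):.2f} cm")
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_fraction_within(k):.4f}")
```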

The central limit theorem of probability theory states that, under fairly general conditions, the distribution of the sum of a large number of independent random variables approaches the normal distribution. And almost any quantitative trait is shaped by a large number of factors that differ in direction and strength (this is especially true of body size). That is why most quantitative traits follow the normal distribution.
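The theorem is easy to see in a simulation. In the sketch below each "factor" is drawn from a uniform distribution, which is clearly not bell-shaped, and a trait value is the sum of 20 such factors. The number of factors, their range and the sample size are arbitrary assumptions made for illustration. If the sum is close to normal, about 68% of the simulated individuals should fall within one standard deviation of the mean.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable
N_FACTORS = 20          # assumed number of independent factors
N_INDIVIDUALS = 5000    # assumed sample size

# Each trait value is the sum of many small independent contributions.
trait = [
    sum(rng.uniform(-1.0, 1.0) for _ in range(N_FACTORS))
    for _ in range(N_INDIVIDUALS)
]

mean = statistics.fmean(trait)
sd = statistics.pstdev(trait)
within_one_sd = sum(abs(x - mean) <= sd for x in trait) / N_INDIVIDUALS

print(f"mean = {mean:.3f}, standard deviation = {sd:.3f}")
print(f"fraction within one standard deviation = {within_one_sd:.3f}")
```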

However, this statement is true only to a first approximation. As is well known, to judge whether a model is acceptable one has to look at the boundary conditions. The normal distribution is symmetrical and defined on the whole set of real numbers, from −∞ to +∞, although the probability density falls off quite quickly away from the mean. Let us return to the trait "human height". Indeed, there is no hard upper limit on a person's height: whatever record holder we find, there is never a guarantee that a taller subject will not turn up sooner or later. But there is a theoretical lower limit: a person's height, by definition, cannot be less than zero. This means that, strictly speaking, the boundary conditions rule out the Gaussian model for human height. Moreover, if we take many people, we find that the distribution of their height is slightly asymmetrical, skewed to the right: the physical lower limit at zero makes itself felt! What model can we offer instead of the Gaussian as more adequate for the quantitative traits of biological objects?

Let us think about this. Traits are formed in the course of the individual development of an organism, which is in effect a very complex chemical reaction proceeding under the control of genes, which at certain moments provide certain concentrations of certain substances. These concentrations enter as factors into the rate equations of the reactions that make up individual development (for example, Michaelis-Menten equations), and the values of traits depend directly on some (or even all) of these rates. Therefore the contributions of individual genes to a quantitative trait usually do not add up but multiply, that is, each gene increases or decreases the value of the trait by some factor. The product of many independent positive random variables tends to a lognormal distribution. As a result, the real distributions of quantitative traits of organisms are not normal but lognormal. The two are indeed very similar, but the lognormal is somewhat asymmetrical, with a longer right tail.
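The reason a product behaves this way is that the logarithm of a product is the sum of the logarithms, so the central limit theorem applies to the logarithm of the trait. The sketch below simulates a trait as a product of 20 independent positive factors; the number of factors and their range are assumptions chosen only for illustration. The raw values come out skewed to the right, while their logarithms are nearly symmetrical.

```python
import math
import random
import statistics

def skewness(values):
    """Third standardized moment: 0 for a symmetrical distribution, positive for a long right tail."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

rng = random.Random(4)  # fixed seed so the run is repeatable

# Each hypothetical factor multiplies the trait by something between 0.8 and 1.25.
products = [
    math.prod(rng.uniform(0.8, 1.25) for _ in range(20))
    for _ in range(5000)
]
logs = [math.log(p) for p in products]

skew_raw = skewness(products)  # clearly positive: longer right tail
skew_log = skewness(logs)      # close to zero: roughly normal on the log scale

print(f"skewness of the trait:          {skew_raw:.2f}")
print(f"skewness of log(trait):         {skew_log:.2f}")
```

This is also why quantitative traits are often log-transformed before methods that assume normality are applied.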

[Figure: normal (A and B) and dwarf (C) peas]

Here it is precisely this trait, the relative length of the internode, that behaves as an alternative trait, whereas plant height itself very rarely behaves as a true alternative trait.

There is one more conventionally distinguished class of traits of which you need a clear idea. Take such a trait as the number of tines on a deer's antlers. The smallest antlers are unbranched. At the maximum we find 10 tines on the two antlers. We will have no difficulty assigning an antler to the class with a given number of tines, and on this ground we might think that this is a qualitative trait. But the quality here corresponds to an integer, and the number of classes, like the series of integers, is unlimited (no one can guarantee that sooner or later we will not come across a deer with 11 or more tines). Such traits are called countable; they are also called meristic, which can be confusing, since here we need not to measure but to count. In fact there is a simple pattern: the larger the antler, the more tines it has; for another tine to be added, the antler rudiment simply has to gain a certain critical increment of mass. So the count of tines is just a measure of antler size. With the number of cells in a dragonfly's wing this becomes even more obvious. We arrive at the same kind of measure in ordinary measurement once we stop at some level of accuracy. Imagine that we count not the tines on a deer's antlers but the hairs on its young, velvet-covered antlers. In effect we have different measures of antler size, only with different step sizes (rounding).

Countable traits are handled with the same approaches as quantitative ones, with some peculiarities in the mathematical processing. It would be a mistake to apply to them the approaches used for alternative traits. For example, one Moscow group studied the number of cells in certain areas of dragonfly wings. They counted the cells, computed means and standard deviations, and found, for instance, that the means differed statistically significantly between two water bodies. From this they concluded that the populations of the two lakes were genetically distinct, on the grounds that alternative traits must be determined by hereditary factors, one or a few. But they had handled their trait as a quantitative one! Most likely, in one of the water bodies the dragonflies developed under less favorable conditions and had a smaller wing area, which held fewer cells, since cell size is fairly standardized in ontogeny.

Finally, a third large class is often distinguished: rank traits. These are cases where we can rank objects as "more" or "less" ("better" or "worse") but have no direct way to express the superiority of some over others numerically. The situations in which rank traits arise are quite diverse. On the parade ground we can easily line soldiers up by height without measuring them; in the same place we easily recognize military ranks by the shoulder straps, knowing in advance how they are ordered relative to each other. In some cases we are forced to assess complex integral parameters subjectively, for example the "vigor" of individual plants, sorting them into "strong", "medium" and "weak".

Curiously, as soon as we have ranks, we already have a numerical measurement of the trait, albeit a very rough or subjective one: ranks, being ordinal numbers, are themselves integers, and they can be handled like measured traits. For all the conventionality of such a "measurement", mathematical methods have been developed that allow quite reliable conclusions to be drawn from ranks. Moreover, even unquestionably qualitative traits can, as an approximation, be treated as quantitative ones. Suppose we have four color morphs: we can regard them not as one qualitative trait but as four quantitative traits, each taking two values, 0 (the individual does not belong to the given morph) and 1 (the individual belongs to it). Experience shows that such artificial "quantitative traits" can be processed successfully.
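The recoding of morphs into 0/1 traits can be sketched in a few lines of Python. The four morph names and the small sample are hypothetical; the point is only that each morph becomes its own numerical column, and that the mean of a column is simply the frequency of that morph in the sample.

```python
# Hypothetical set of four color morphs, in a fixed order.
MORPHS = ["white", "yellow", "pink", "purple"]

def indicator_traits(morph):
    """Recode one qualitative value as four 0/1 traits, one per morph."""
    if morph not in MORPHS:
        raise ValueError(f"unknown morph: {morph}")
    return [1 if morph == m else 0 for m in MORPHS]

# Hypothetical sample of five individuals.
sample = ["white", "purple", "purple", "yellow", "purple"]
table = [indicator_traits(m) for m in sample]

# The mean of each 0/1 trait is the frequency of the corresponding morph.
frequencies = [sum(column) / len(sample) for column in zip(*table)]

for morph, freq in zip(MORPHS, frequencies):
    print(f"{morph}: {freq:.1f}")
```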

As the example of pea height shows, the same trait can be both quantitative and qualitative. Any quality we distinguish can always be measured somehow (even belonging to the male or female sex can be measured as a ratio of certain hormones). The choice of how to handle a trait, as the value of a numerical parameter or as an indicator of class membership, is dictated by the specifics of the task. With a bimodal distribution it is useful, at least as a first approximation, to divide all individuals into two classes, even if the two humps of the distribution merge and the individuals falling between them cannot be classified unambiguously except by formally introducing a threshold value.

Both qualitative and quantitative traits can be inherited to some degree and therefore fall within the scope of genetics. Genetics uses different models to analyze the two. The inheritance of qualitative traits (it was with these that Mendel worked) is described, more simply and more precisely, in terms of combinatorics and probability theory, and it is mainly what we will deal with. The inheritance of quantitative traits is described in terms of mathematical statistics and rests mainly on the analysis of correlations and on partitioning variance into components. As mentioned above, the inheritance of qualitative traits can also be treated as the inheritance of quantitative traits, which in some cases proves a very fruitful approach. I hope we will have time for a brief look at the rudiments of quantitative trait genetics. In the meantime, a little more terminology.

Two concepts no less broad than "trait", but ones we cannot do without, are genotype and phenotype. These terms, like the term "gene", were introduced in 1909 by the Danish geneticist Wilhelm Ludvig Johannsen. The phenotype is everything that concerns the traits of the organisms in question; the genotype is everything that concerns their genes. Clearly, there can be infinitely many traits, and there are tens of thousands of genes. Moreover, nobody records the vast majority of traits, and nobody knows the vast majority of genes. But phenotype and genotype are working concepts whose content is dictated in each case by the genetic experiment. A genetic experiment usually consists in crossing some individuals with others, often over many generations, and following the traits of the offspring, which may be selected, crossed and so on according to those traits. Alternatively, a sample of individuals is taken from nature, their traits are recorded, the variants of certain genes they carry are determined, and the dynamics of the frequencies of those variants is observed. In each case we are dealing with well-defined traits and genes, often only a few. When we speak of the phenotype, we mean the values or states of precisely these traits, and when we speak of the genotype, we mean the set of these genes. The former depends on the latter but, as we shall see, not in the most direct way. Genetics consists largely in elucidating this dependence. Only when the DNA sequence itself serves as the trait does the phenotype coincide with the genotype.

Only recently has it become possible to run high-throughput experiments that track all known genes of organisms in which they are known (humans, for example), for instance by the presence or absence of every messenger RNA or every protein in a particular tissue. The corresponding fields are called "transcriptomics" and "proteomics", and the totality of all messenger RNAs or all proteins present in a particular object is called the transcriptome and the proteome, respectively.

1.3. The concepts of "gene", "locus", "allele", "ortholog", "paralog", "mutation".

Given our preliminary claim that there is a lot of mathematics in genetics, we might expect terminological rigor in it. Unfortunately, genetics is also an empirical science resting on a vast and heterogeneous body of experimental material produced by many scientists of different specializations (and different educational backgrounds!), and this has led to various terminological "dialects" within genetics, including in very important matters. Let us move on to a concept that might seem central to genetics but has in fact turned out to be too vague for that role. Tell me, what is a gene? It is really a very unlucky concept, and it now has several meanings. In classical genetics a gene is an inherited factor that affects the traits of an organism. At one time it was regarded as the indivisible unit of heredity. After the discovery of the structure of DNA, it quickly became clear that many classical genes are stretches of DNA encoding a particular protein, for example an enzyme, that determines the inherited trait. This was a huge breakthrough, and on this wave it initially seemed that all the genes of classical genetics were of exactly this kind. The formula "one gene - one polypeptide chain" emerged. It had been proposed, in its original form "one gene - one enzyme", in 1941 (that is, 12 years before Watson and Crick worked out the structure of DNA) by George Beadle and Edward Tatum (you will find portraits of these and many other scientists in the textbook). They worked with strains of the mold Neurospora that differed in their ability to carry out particular biochemical reactions and found that each gene is responsible for one specific biochemical reaction, that is, for a particular step of the mold's metabolism. For this work they received the Nobel Prize in 1958. Note that at that stage the gene was still understood in the classical way, although active research was under way to find out what it is physically. And after the discovery of the structure of DNA everything seemed to fall into place, and the word "gene" came to denote the DNA segment encoding a polypeptide chain.

However, it was eventually found that next to the coding sequence there are always regulatory DNA sequences that encode nothing themselves but affect whether the gene is switched on or off and how intensively it is transcribed. You know them well: the promoter, the landing site of RNA polymerase; operators, the landing sites of regulatory proteins; enhancers, also sites for regulatory proteins, which promote transcription but lie at some, occasionally considerable, distance from the coding sequence; silencers, sequences that suppress transcription; and so on. Sometimes they are located hundreds or thousands of nucleotides away (on the scale of a chromosome this is not much), yet they still act as cis-factors (that is, as neighbors), being brought physically close by a particular folding of the DNA. All this machinery came to be regarded as part of the gene that encodes something. Thus, in the molecular genetics of eukaryotes a gene is a coding region of DNA together with the adjacent regions of DNA that affect its transcription.

For such a region S. Benzer proposed in 1957 the more precise term cistron, which was unlucky as well, since it came to denote sometimes only the coding region of DNA (the so-called open reading frame) and sometimes the DNA region between the promoter and the terminator from which a single RNA molecule is read. You will remember that in prokaryotes, whose molecular genetic mechanisms began to be worked out earlier, the operon organization of genes is widespread: sequences encoding several polypeptide chains share common regulation and are read as part of a single mRNA. This makes the above definition of the term "gene" inapplicable. The term "cistron" is of little help here either: defined as a DNA region from which a single RNA is read, it would include regions encoding several different proteins, although this very arrangement was once called the "polycistronic principle of organization of genetic material". As a result, using the terms "gene" and "cistron" without explanation (at least of which kingdom of organisms is meant) now invites misunderstanding.

Note that in the molecular biological sense the gene turned out to be divisible into parts: exons, introns, operators, enhancers and, ultimately, individual nucleotides. A regulatory DNA sequence, taken by itself, lost the right to be called a gene, since it encodes nothing. But through its effect on the transcription of a gene, such a sequence can also affect some trait (that is, the phenotype), which will be inherited together with the sequence. And it can be separated from the coding sequence by recombination, especially if it is a remote enhancer. In other words, a regulatory sequence is also a distinct hereditary factor with its own place on the chromosome. Some regulatory sequences, such as enhancers, can affect the transcription of several genes at once, that is, they occupy their own place in the regulatory network that controls the development and functioning of the organism. All the hallmarks of a gene in the sense of classical genetics are present.

This contradiction between the classical and the molecular biological concept of the gene, which arose at a time when it seemed that all classical genes were transcribed stretches of DNA encoding a protein or an RNA, has still not been resolved. This is not particularly important, however, since the word "gene" has long ceased to be used as a strict term. With the rapid development of molecular biology, the molecular biological concept is winning: a gene is a transcribed stretch of DNA together with its regulatory DNA sequences. Yet the classical concept of the gene as a hereditary factor (no matter how it functions, what it is, or what it consists of) was historically the first, lasted more than half a century, and proved extremely fruitful. You need to be aware of this contradiction and learn to tell from the context what is meant.

In practice the contradiction is handled in two ways: either the meaning of the word "gene" is specified before it is used, or the word is not used as a term at all. An example of the first case: the "Materials and methods" section of an article on counting the genes in a genome will state the criterion by which a gene was identified, for example that open reading frames were counted. The next article will say: we analyzed expression and showed that some of the putative reading frames are never transcribed and are apparently not genes but pseudogenes. An example of the second case: a locus is being studied from which several thousand proteins are produced, because it has three alternative promoters, three alternative terminators and a dozen introns subject to alternative splicing. Where is the gene here, and how many genes are there at this locus? In such a case the word "gene" will appear only in the introduction, as a synonym for "locus". And if we take a phrase containing the word "gene" from a population genetic context and insert it into a molecular biological one, the meaning will be lost.

Different variants of the same gene, in whatever sense, are denoted by the term alleles. In this form the term was proposed by W. Johannsen in 1926, on the basis of the term "allelomorphic pair" introduced by W. Bateson in 1902. The concept of the allele appeared when nothing was known about the structure of DNA, and it was introduced precisely as an alternative variant of a gene. It is especially important for diploid organisms, which receive the same set of genes from the father and from the mother, so that each gene is present in the genome in two copies. These copies may be identical or may differ, but not so much that they can no longer be called "the same gene". These two copies are called alleles.

Amusingly, for the term "allele" there is no unambiguous answer even to such a simple question as the grammatical gender of the word in Russian. The Moscow school, like the Kiev and Novosibirsk schools, treats "allele" as masculine, while the Leningrad (St. Petersburg) school treats it as feminine. You can see that even the two recommended textbooks use the word differently.

The term "alleles" was originally introduced for variants of a gene responsible for a particular trait, each variant being associated with a state of that trait. It turned out, however, that independent genes can influence the same trait in the same way. This raises the problem of telling alleles of one gene from alleles of different genes. Fortunately, it had become clear even earlier that genes are arranged in a strictly defined order in linear structures (chromosomes, as it turned out), so that each gene occupies a strictly defined place on one of the chromosomes. Each gene could therefore be identified not only by its effect on a trait but also by its place on a particular chromosome. Thus each place on a chromosome responsible for some trait, a locus, is occupied by one of the alleles, the individual variants of the gene. A diploid nucleus contains two alleles of each locus, one received from the mother and one from the father, which may be different or identical. A locus can be defined as a position on a chromosome occupied by a certain hereditary factor, and an allele as a variant of a certain hereditary factor; since it is the locus that gives the hereditary factor its identity, an allele is, more precisely, a variant of the hereditary factor located at a particular locus. Obviously, this definition is given from the standpoint of classical genetics. It is also better to say "locus on a chromosome" than "locus of a chromosome", because the latter may suggest that the chromosome consists solely of loci that have genetic meaning. A gene in the classical sense does correspond to a certain segment of chromosomal DNA, and very often non-coding segments of DNA can affect something at least indirectly (for example, a block of repeats can promote chromatin compaction and thereby affect the transcription intensity of coding DNA segments located even at a considerable distance from it). Nevertheless, there certainly are extended stretches of DNA that have no genetic content, that is, they affect nothing and are not genes in any sense.

But the terms "locus" and "allele" also have an amusingly broad meaning. If we study the DNA sequence itself, which is then both our trait and our gene, since it literally encodes itself, we may call any recognizable part of it a locus and any variant of that part an allele. For example, the genome contains so-called microsatellites, sequences of very short (two or three letters long) tandem repeats, that is, repeats arranged one after another. The number of repeats changes very easily through mechanisms involving slippage during replication or incorrect recombination. It is in fact through these mechanisms that microsatellites "breed" in the genome, although they have no function of their own and are not genes in the molecular sense. Because of their high variability, microsatellites are a favorite object of evolutionary geneticists, since the number of repeat copies can be used, with some degree of confidence, to judge relatedness. In this case too it is customary to speak of alleles, meaning microsatellite sequences of different lengths (that is, with different numbers of repeat copies).
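To make the idea of a microsatellite allele concrete, here is a minimal Python sketch that labels a sequence by the number of repeat units in its longest uninterrupted tandem run. The sequences, the flanking regions and the CA motif are invented for illustration, and the helper `repeat_count` is a hypothetical name; in practice repeat numbers are usually inferred from fragment length or sequencing reads rather than from string matching like this.

```python
import re

def repeat_count(sequence, motif):
    """Number of repeat units in the longest uninterrupted tandem run of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

# Two invented variants of one microsatellite locus: same flanks, different repeat number.
LEFT_FLANK, RIGHT_FLANK = "GGTACC", "TTGAGC"
allele_1 = LEFT_FLANK + "CA" * 12 + RIGHT_FLANK
allele_2 = LEFT_FLANK + "CA" * 15 + RIGHT_FLANK

count_1 = repeat_count(allele_1, "CA")
count_2 = repeat_count(allele_2, "CA")

print(f"allele 1: (CA) x {count_1}")
print(f"allele 2: (CA) x {count_2}")
print("same allele" if count_1 == count_2 else "different alleles")
```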

It turns out that in classical genetics the word "gene" can be dispensed with altogether. There is a locus, a place on the chromosome, which is always occupied by one of the alleles. A locus relates to an allele as a variable relates to its value. Moreover, under the classical definition both the locus is a gene (as a generic concept) and the allele is a gene (as an individual concept). One often hears "these genes are non-allelic to each other": people speak of allelic and non-allelic genes, meaning alleles of one locus and alleles of different loci. In genetic practice a not very strict tradition has formed of using the word "gene" as a synonym for "locus", and such usage will occur in our text as well.
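The variable-and-value analogy can be taken almost literally. In the Python sketch below a locus is a named position on a chromosome, and the genotype of a diploid individual assigns to each locus a pair of alleles, one from each parent. The locus names, chromosome labels and allele symbols are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Locus:
    """A position on a chromosome occupied by a hereditary factor (the 'variable')."""
    name: str
    chromosome: str

def zygosity(genotype, locus):
    """Classify a diploid individual at one locus by comparing its two alleles (the 'values')."""
    maternal, paternal = genotype[locus]
    return "homozygote" if maternal == paternal else "heterozygote"

# Hypothetical loci and alleles, for illustration only.
locus_a = Locus(name="a", chromosome="chromosome 1")
locus_b = Locus(name="b", chromosome="chromosome 2")

# The genotype maps each locus to the pair of alleles occupying it.
plant = {
    locus_a: ("A", "a"),
    locus_b: ("b", "b"),
}

for locus in (locus_a, locus_b):
    print(f"locus {locus.name} on {locus.chromosome}: {zygosity(plant, locus)}")
```

In this picture two alleles are allelic to each other exactly when they are values of the same `Locus` key, which is the distinction the phrase "allelic and non-allelic genes" is trying to express.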

But there are situations in which the word "gene" is hard to avoid. Suppose red-flowered peas were treated with a chemical mutagen and white-flowered peas were obtained. It was established that the trait "flower color" is inherited as if determined by a single locus; in such cases it is customary to speak of a monogenic trait (although the non-existent term "monolocal" would be more accurate). However, white-flowered peas were already known, and that trait is determined by an allele of a well-known locus. The question is whether we have obtained the same allele of the same locus, or a different allele (at the level of DNA sequence) of the same locus that nevertheless also gives white flowers, or an allele of a new, previously unknown locus, which may be responsible, say, for an entirely different step of pigment synthesis. Until this is established, one has to say loosely: "We have obtained a gene for white flowers." Incidentally, this is a real situation from the life of our laboratory: we obtained a gene determining white flowers that turned out to be allelic not to the widely known locus a, responsible for the anthocyanin coloration of the flower, but to the little-known locus a2.

The terms "locus" and "allele" can also be applied to a gene in the molecular genetic sense, namely to a specific nucleotide sequence. Here "locus" and "gene" mean the same thing, and an allele is a specific nucleotide sequence of the given gene. Within molecular genetics, however, the need for these terms does not arise very often, since molecular biological analysis usually abstracts from the fact that a diploid organism carries a second such gene, with an identical or slightly different sequence, on the homologous chromosome.

You probably know from molecular biology about multigene families: cases where the genome contains several genes, in the molecular sense, that encode protein products of the same type, the same enzyme for example. They may differ somewhat in primary structure (of both the DNA and the protein product), in certain physicochemical properties of the protein product (the intensity of its molecular function), and in expression pattern, that is, in the place, time and intensity of synthesis. The pea, for instance, has seven genes (in the molecular sense) for histone H1, each encoding a particular variant of the molecule; one of these variants is present only in actively dividing cells and disappears from the chromatin of cells that have stopped dividing. Any sequence of any of these genes would be a variant of the histone H1 gene. But within one genome these seven genes occupy different loci, so only different variants of one particular locus are alleles. You must be familiar with the concept of homology, similarity based on common origin, and with homologs, objects that show such similarity. Molecular genetics distinguishes two types of gene homology. Homologous but non-allelic genes occupying different loci in the same haploid genome are called paralogs (from the Greek "para", near, beside). Individual variants of the same locus in different individuals are called orthologs (from the Greek "ortho", straight, direct; recall the ortho and para isomers in organic chemistry). In essence, orthologs are alleles. However, molecular biologists usually use the term "ortholog" when studying the genes of different species, in cases where it can be established unambiguously that the genes belong to the same locus, while the term "allele" is used only for gene variants within a single species or in closely related species that can still interbreed (for example, wheat and its wild relatives). Thus, "allele" is a genetic concept: one speaks of alleles when they can, in principle, take part in crosses.

Let's ask ourselves a question: where did paralogs come from? It is logical and correct to assume that they arose through gene duplication, that is, rare cases in which a gene is copied within the genome. Naturally, any such event, however rare, occurs within a single species. As a result, we get a situation where some individuals of the same species have two loci in the genome that are identical in primary structure (they can accumulate differences over time), while others have only one. Let us assume that the two copies of the duplicated gene lie side by side, so that both new loci are located where the single old one used to be. And then they begin to accumulate differences. Where are the alleles here, and which sequences are alleles of what? We have arrived at a situation in which the concept of "allele" fails, and this is very good, since in doing so we have traced the limit of its applicability.

By the way, an unexpectedly non-trivial question is which alleles count as different and which as identical. In the early stages of the development of genetics, alleles were recognized only by phenotype, and only those that led to different phenotypes were considered different alleles. Most often there were two alleles, normal and defective (mutant), so the "presence-absence theory" (of a certain function) was popular at that time. However, as genetics developed, more and more cases became known in which the same trait has several heritable variants, which ultimately led to the well-known aphorism of Thomas Morgan: "One presence cannot correspond to several absences." And in the case of quantitative traits determined by many genes at once, a single allele has no distinct phenotypic manifestation at all. In the end, geneticists settled on treating alleles as different by default unless, in the given experiment, they were known to be inherited from the same individual; that is, unless they were identical by origin and this identity had been established. For example, we catch one hundred seemingly identical individuals in nature in order to study small nuances in the phenotypic manifestation of a certain gene, cross them with special tester lines, transfer the studied gene obtained from them into an identical genetic background, and measure the trait of interest. In doing so, we assume that one hundred different (by origin) normal (!) alleles participate in the experiment, since all of them were obtained from viable individuals in nature.

You understand that once it became possible to determine the primary structure of the studied genes, the question of the identity of alleles ceased to be theoretical and was reduced to the identity of their primary structure (nucleotide sequence). If there is at least one substitution, the alleles are different; if there is none, they are the same, since they are completely identical molecules. Given that nucleotide substitutions accumulate, and that many of them do not affect the function of the locus, in practice this approach differs little from treating any alleles obtained independently from different individuals as different a priori. However, the rate at which substitutions arise varies greatly from locus to locus: for example, in some loci we observed an identical nucleotide sequence even in alleles obtained from different pea subspecies (wild and cultivated).
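The sequence-based criterion of allele identity can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned and of equal length; the function names and the short example sequences are hypothetical and serve only as an illustration.

```python
def count_substitutions(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x != y)


def same_allele(seq_a, seq_b):
    """Alleles are identical only if there is not a single substitution."""
    return count_substitutions(seq_a, seq_b) == 0


# Hypothetical sequences of one locus obtained from three individuals.
allele_1 = "ATGGCTAGCTTA"
allele_2 = "ATGGCTAGCTTA"
allele_3 = "ATGGCTAGATTA"  # differs from allele_1 at one position

print(same_allele(allele_1, allele_2))  # True
print(same_allele(allele_1, allele_3))  # False
```

A single substitution is enough to make two alleles different under this criterion, regardless of whether it affects the function of the locus.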

Let's touch on such non-strict but popular terms as "wild-type allele", "mutant allele" and "null allele". The "presence-absence theory" mentioned above is quite applicable in many cases. Let's take peas as an example again. Pea flowers contain a pigment, anthocyanin, which colors them pink-red (purple). If any of the proteins involved in the biochemical pathway of anthocyanin synthesis is defective or absent, anthocyanin is not synthesized and the flowers remain white. Suppose there is a locus on a certain chromosome, let's denote it a, that contains the DNA sequence coding for one of these proteins. Usually this is put less strictly but more simply: on a certain chromosome there is a gene a that encodes one of these proteins. (Peas do have a gene with this designation; it encodes a regulatory protein that binds to DNA rather than an enzyme involved in anthocyanin synthesis.) Let this gene have two alleles, which we denote A and a. Allele A encodes a normal functional protein. Allele a does not encode a functional protein. How this is possible we will discuss later; for now it is important that this allele simply "does not work": it does not fulfill its molecular function, even if that function is unknown to us. In such cases, the normal allele is called the wild-type allele. In the case of peas this term is doubly correct. Peas are both cultivated and wild (representatives of the same species continue to exist in the wild). All wild peas have purple flowers, while cultivated ones have both purple and white flowers, with white predominating in vegetable and grain varieties of European breeding. An allele that is not capable of forming a functional protein product is often also called a null allele.

There are cases where the concepts of "wild type" and "null allele" are not applicable. For example, the two-spot ladybird Adalia bipunctata has two forms: red with black spots and black with red spots. (By the way, this is one of the classic objects of population genetics, introduced into this science by Timofeev-Resovsky.) Both forms occur in the European part of Russia, and neither is better than the other (in Novosibirsk, however, only the second is found). Neither of them can be called the wild type in contrast to the other. However, it is possible that one of these alleles is associated with the loss of the molecular function of the protein product of this locus, which, like the products of other developmental genes, is likely to be a factor influencing the expression of other genes.

Then there is a popular term in genetics: mutation. Historically, the concept was introduced by Hugo de Vries in a sense close to the one it now has in horror films: a sudden change in hereditary factors leading to a radical change in the phenotype. De Vries worked with a species of evening primrose (Oenothera), which, as it turned out later, has highly unusual cytogenetics: because of multiple chromosomal rearrangements, the entire genome is inherited as a single allele. Nevertheless, the word has become a widely used term, and not just in Hollywood. Sergei Sergeevich Chetverikov, one of the founders of population genetics, used the term "genovariation", which is more correct but did not take root (even though Chetverikov was among the Russian geneticists who had a significant impact on world genetics). Currently, a mutation is understood as any change in the primary structure of DNA, from the substitution of a single nucleotide to the loss of huge parts of chromosomes. I would like to draw your attention to the fact that the word "mutation" refers to the event of change itself. However, in non-strict but tenacious genetic practice, the same word is often applied to its result, that is, to the allele that arose from the mutation. People say: "The experiment involves Drosophila carrying the mutation white". No one recorded the mutational event that gave rise to this classic mutation (by the way, it is associated with the insertion into the gene of the mobile genetic element copia, which moves exceptionally rarely), but everyone keeps saying "mutation" instead of "mutant allele". It is understood that at some point a mutation occurred that spoiled the normal allele, resulting in a mutant one.
It is easy to see that "mutant allele" is likewise an antonym of "wild-type allele", but it is broader than "null allele", since it covers various deviations from the wild-type allele, both those that lead to a complete loss of molecular function (the same "several absences"!) and those that do not.

There is another very unpleasant terminological situation that some of you will have to deal with in human genetics. As we shall see later, human genetics has in general drifted quite far from general genetics in its terminology. The reason is that, on the one hand, this specialized field belongs to both biology and medicine and is institutionally isolated from the rest of genetics, so in this sense it stews in its own juice. On the other hand, because of its practical significance, this field is very large in terms of the number of researchers, studies, journals and articles, which makes its internal traditions resistant to external influences, including those from its "mother", general genetics. Modern human genetics has advanced so far that in many cases it has realized the age-old dream of geneticists: it has become able to associate certain traits (including pathological ones) with the presence of certain nucleotides at specific positions of specific genes. But it was here that an unfortunate terminological substitution occurred. When we compare the primary DNA structure of many alleles, it turns out that some positions always carry the same nucleotide, while at other positions nucleotide substitutions are possible. (There is a suspicion that, across the genomes of all humankind, any nucleotide can be found at any position, which raises an amusing philosophical question: what is the human genome?) Such positions were, correctly, named polymorphic positions; indeed, each of them exhibits alternative variability, that is, polymorphism, with respect to which of the four nucleotides occupies it. But then a substitution of concepts somehow took place: "polymorphism" came to mean a specific nucleotide at a specific polymorphic position (what should be called a "morph").
People began to say something like this: "We sequenced such-and-such a gene in so many people and found twelve polymorphisms, two at such-and-such a position, six at another, and four at a third. Two of the polymorphisms at such-and-such a position showed a significant association with such-and-such a syndrome." Most likely, this substitution took place at the level of laboratory slang, which exists in any scientific work and consists in simplifying terminology, often illiterately. Students who come to the lab sometimes mistake slang for terminology and begin to use it in all seriousness. At some point both the author of an article and the reviewers of a scientific journal turn out to be used to the same slang; then it penetrates the scientific press and, with some probability, becomes fixed. (The picture, by the way, is more than familiar from population genetics and fully reproduces the process of speciation: in an isolated population, random anomalies in the system for recognizing suitable sexual partners arise, happen to coincide in the two sexes and become fixed; they then become the norm in the new species and prevent it from crossing with the old one.) In addition to the etymological contradiction (a single morph is called by a word meaning that there are many morphs) and bad taste, this substitution has a further consequence: researchers who use this jargon have deprived themselves of the term "polymorphism" in its correct meaning. When they need to express the corresponding concept (which has not disappeared), they have to resort to verbose descriptions instead of an unambiguous term. For instance, in situations for which the term "balanced polymorphism" exists (one morph has an advantage under some conditions and another under others, so they coexist without displacing one another), they always have to resort to long descriptions like the one just given.
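The distinction between a polymorphic position and the individual morphs found at it can be made concrete with a small Python sketch. It assumes a set of already aligned allele sequences of equal length; the function name and the sample sequences are hypothetical.

```python
def polymorphic_positions(aligned_alleles):
    """Map each polymorphic position (1-based) to the set of morphs seen there.

    A position is polymorphic if more than one nucleotide occurs at it
    among the compared alleles; each nucleotide variant is a morph.
    """
    if len({len(seq) for seq in aligned_alleles}) > 1:
        raise ValueError("all sequences must be aligned to the same length")
    result = {}
    for position, column in enumerate(zip(*aligned_alleles), start=1):
        morphs = set(column)
        if len(morphs) > 1:
            result[position] = morphs
    return result


# Hypothetical alleles of one gene sequenced in four individuals.
sample = ["ACGTAC", "ACGTAC", "ATGTAC", "ATGTCC"]

for position, morphs in polymorphic_positions(sample).items():
    print(position, sorted(morphs))
# 2 ['C', 'T']
# 5 ['A', 'C']
```

In the terminology defended above, this sample has two polymorphic positions with two morphs each, not "four polymorphisms".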

In introducing you to the traditional and not always consistent genetic terminology, I must mention a rather curious term: marker. This term was introduced for loci that matter to us not in themselves but insofar as they mark a certain region of a chromosome. It appeared during the long period when not very many genetic loci were known. It was needed in situations where a newly discovered gene had to be mapped or, paradoxical as it may sound, where one had to work with genes that had not yet been discovered. For example, the nature of the genes that control economically valuable quantitative traits of plants and animals was completely unknown for a long time, and even now little is known about them. At the same time, there was no doubt that these genes exist and are located on the chromosomes. By manipulating known loci - markers - it was possible to identify the chromosome regions associated with particular effects on quantitative traits and to use them in breeding. Initially these were mostly "visible markers", loci that had alleles with a visible effect. Later this approach was greatly extended by bringing biochemical traits into genetic analysis (as a rule, also not functionally related to the economically valuable traits), and later still by the opportunity to work with the polymorphism of chromosomal DNA itself. This led to the concept of a "molecular marker". Thus, the term "marker" is merely a synonym of the term "locus", but it emphasizes that the locus interests us not as such, but only as a landmark on the chromosome. However, the term has become so habitual that it is now also used where the locus is itself the object of study. Paradoxically, in molecular phylogenetic studies the analyzed sequences themselves are also commonly referred to as markers.
Here it may be implied that they are merely landmarks in time, and that nucleotide substitutions in them mark evolutionary events, which, of course, are not limited to changes in the analyzed sequences alone.

Genes (more precisely, loci) are usually denoted by abbreviations consisting of Latin letters and sometimes digits. Behind these designations, however, stand the full names of the genes, Latin or, more often, English. Both the full names and the abbreviations of genes are always written in italics. For genes with a visible manifestation, the name is usually a word describing the mutant phenotype: w - white (white eyes in the fly), y - yellow (yellow body in the fly), a - anthocyanin inhibition (in peas), op - ovula pistilloida (in peas), bth - bithorax. The last is not a very good name for a Drosophila mutation in which a second pair of wings appears on the metathorax (as on the mesothorax): it reads as if the whole thoracic tagma had doubled. There is even a Drosophila mutation with the official name fushi tarazu (abbreviated symbol ftz), which is Japanese. Cheerful Americans named one of the genes mothers against decapentaplegic, by analogy with organizations such as "mothers against the war in Iraq": in female fruit flies carrying this mutation, offspring that carry the decapentaplegic mutation do not survive. The abbreviation for this gene sounds just as good: Mad. Occasionally, and not in the most popular objects, the official name of a gene and its abbreviation are unrelated: the mutation that turns pea tendrils into leaves has the designation tl (from tendrilless), while its name is clavicula. If a gene is known by its molecular product (protein or RNA), then the gene itself is named after its product: mtTrnK - mitochondrial transfer RNA for lysine, Rbcl - ribulose bisphosphate carboxylase large subunit.
It is important that each species has a completely independent official nomenclature of gene symbols. This causes some difficulties at present, when the number of objects with well-developed private genetics has increased, and the number of objects in which genes are studied not through genetic experiments but by direct reading of DNA sequences is growing like an avalanche (for example, the "10,000 Vertebrate Genomes" project is already under way).

Genetics began with cases where only two alleles were known at each locus and it was possible to distinguish them by writing with a capital or small letter, which was initiated by Mendel. A capital letter was used for the dominant allele (you know what this means from school, we will touch on the phenomenon of dominance in more detail later) - this is usually the wild-type allele; as we would now say - an allele with a normal, unimpaired molecular function. At the same time, the locus was designated with a small letter, that is, its designation coincided with that of a recessive, that is, mutant, non-functional allele, because it was by the existence of such an allele that scientists first learned about the existence of the locus. In rare cases, when the mutant allele turned out to be dominant, both it and the locus itself were designated with a capital letter.

When it became clear, very soon, that a locus can have many alleles (now we know that there are a great many of them), allele designations were introduced, written as a superscript after the designation of the locus. The "+" symbol is often used as such an index for the wild-type allele; sometimes the wild-type allele has no index. For example, at the very first known Drosophila locus, white (w), the wild-type allele is denoted w+ (with "+" as a superscript), the allele responsible for white eyes is w, and the allele responsible for apricot eyes is wa, with "a" as a superscript (full name white-apricot).

I draw your attention to the fact that for traditional genetic objects with developed private genetics, different traditions still coexist in writing the designations of loci and their alleles. So far I have found three of them:

Loci with visible manifestation are written with a small or capital letter, depending on whether the locus is described by a recessive or dominant allele in relation to the wild type; and capitalized if the locus is known from molecular function. At the same time, for loci with visible manifestation and dominance, the tradition is preserved to write recessive alleles with a small letter, and dominant alleles with a capital letter. Such is the genetic nomenclature, for example, in peas and mice. For example, the pea locus a, responsible for the color of the flowers has alleles A and a.

As in the previous case, but the capital and small letters in the designations of the locus and its alleles are rigidly fixed. Such a system is used in Drosophila. Here the designations w and W belong to completely different loci, white and Wrinkled. The wild-type allele is always denoted here by the index "+". (It is curious that Drosophila and mouse geneticists, accustomed to the system adopted for their own objects, are usually not even aware that another system for naming loci exists.)

All letters in the designations of loci are always capital. Such a system is now used in human genetics, and it has been adopted quite recently.

The same allele designations are used for phenotypes, but always without italics. So, if you describe the results of an experiment in which you observed so many pea plants with purple flowers and so many with white flowers, and you know that white-flowering in the experiment is associated with the locus a, then you will designate purple-flowered and white-flowered plants with the letters A and a in the table of occurrence, even if you do not know their genotype. The same is done if you determine the presence of electrophoretic variants of some isoenzyme: there the correspondence of the phenotype to the genotype is greater, but even it is not always unambiguous.

1.4. The concepts of "homozygote", "heterozygote", "hemizygote".

In a diploid organism, each chromosome (except for the sex chromosomes) is present in two copies - homologues received from the father and the mother, respectively. Each homologue carries the same set of loci, and in each homologue every locus is occupied by some allele. Therefore, a diploid organism carries two alleles of each locus. When its genotype is written down, the designations of the two alleles present at the locus (or loci) of interest are written in a row. For example, if the pea locus a has the alleles A and a, three genotypes are possible: A A, A a and a a.
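The enumeration of possible genotypes at one locus can be sketched in Python as unordered pairs of alleles. This is only an illustration: the function name is hypothetical, and the order of the two alleles within a genotype is assumed not to matter.

```python
from itertools import combinations_with_replacement


def possible_genotypes(alleles):
    """List all diploid genotypes for one locus as unordered allele pairs."""
    return [" ".join(pair) for pair in combinations_with_replacement(alleles, 2)]


# Two alleles of the pea locus a give three genotypes.
print(possible_genotypes(["A", "a"]))  # ['A A', 'A a', 'a a']

# Three alleles of the Drosophila locus white give six genotypes.
print(len(possible_genotypes(["w+", "w", "wa"])))  # 6
```

In general, n alleles at a locus give n(n + 1)/2 possible diploid genotypes.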

If the locus is represented by the same allele in both homologues, the individual is said to be homozygous for this allele, or at this locus. When one says "homozygous at a locus", the emphasis is on the absence of differences between the two homologues at that locus; when one says "homozygous for an allele", the emphasis is on which allele it is. If the locus is represented by different alleles in the two homologues, the individual is heterozygous at this locus. For simplicity, homozygous and heterozygous individuals are called homozygotes and heterozygotes, respectively. Considering what was said above about the identity and difference of alleles, true homozygotes are not very common in nature. However, in a particular experiment nothing prevents us from ignoring differences that are not detected, or cannot be detected, in that experiment, and from treating as homozygotes those individuals in which both copies of the locus have an identical phenotypic manifestation. In studies involving related individuals, one encounters individuals that are known to be homozygotes, in which both alleles of some locus are identical by origin. Such studies often use the notion of average heterozygosity: the proportion of heterozygous loci among all loci.
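Average heterozygosity, as defined here, can be computed with a few lines of Python. The sketch below assumes a genotype is stored as a mapping from locus name to a pair of allele designations; the function name and the example individual are hypothetical.

```python
def average_heterozygosity(genotype):
    """Proportion of heterozygous loci among all scored loci of one individual."""
    if not genotype:
        raise ValueError("at least one locus is required")
    heterozygous = sum(1 for first, second in genotype.values() if first != second)
    return heterozygous / len(genotype)


# Hypothetical pea plant scored at four loci (illustrative allele symbols).
individual = {
    "a": ("A", "a"),     # heterozygous
    "le": ("Le", "Le"),  # homozygous
    "r": ("R", "r"),     # heterozygous
    "i": ("I", "I"),     # homozygous
}

print(average_heterozygosity(individual))  # 0.5
```

Note that two alleles count as different here only if their designations differ, which mirrors the remark that an experiment may ignore differences it cannot detect.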

Let's add one more term, hemizygote: an individual in which not two but only one allele is known to be present. For example, you probably know that men have only one X chromosome, and the second sex chromosome, the Y chromosome, is not homologous to it (with the exception of small regions), since it lacks most of the regions rich in genetic information. Therefore, alleles from those regions of the X chromosome that are not represented on the Y chromosome have no homologues in the nucleus, that is, they are in the hemizygous state. Sometimes a chromosome loses a fragment together with the genes (or the single gene) located in it. In this case, the alleles of these genes on the homologous chromosome are also hemizygous. However, in a genetic experiment we often do not know what has happened in the chromosomes, and we judge genes only by the phenotype. In that case, the absence of a gene may be indistinguishable from its "breakdown", that is, the loss of its function. Until we know the molecular background, but have somehow concluded that the molecular function is lost, we will simply speak of an allele, or a "null allele".

Distinguishing between homozygotes, heterozygotes and hemizygotes can be important in diploid organisms because the dose of the corresponding allele in the genome differs by a factor of two between these states (for example, for a locus on the X chromosome, two copies per genome in women and one in men), which may matter. Molecular genetics usually abstracts away from the homozygosity or heterozygosity of its objects. However, it often uses the concept of gene dose, that is, the number of alleles with unimpaired molecular function in the genome. Usually it varies from 0 to 2, but it can be increased by genetic modification, that is, by the artificial introduction of additional copies into the genome.
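Gene dose in this sense is simply a count of functional alleles, which a short Python sketch can make explicit. The function name is hypothetical, and which alleles count as non-functional is assumed to be known in advance.

```python
def gene_dose(genotype, null_alleles):
    """Number of alleles with unimpaired function among those present."""
    return sum(1 for allele in genotype if allele not in null_alleles)


null_alleles = {"a"}  # assumption: allele a has lost its molecular function

print(gene_dose(("A", "A"), null_alleles))  # 2, homozygote for the functional allele
print(gene_dose(("A", "a"), null_alleles))  # 1, heterozygote
print(gene_dose(("a", "a"), null_alleles))  # 0, homozygote for the null allele
print(gene_dose(("A",), null_alleles))      # 1, hemizygote
```

A genotype with more than two entries would model the case where additional copies have been introduced artificially.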

In the case of haploid organisms, it is customary to say that all alleles of all genes are hemizygous. Which organisms are haploid? Prokaryotes, lower fungi and ascomycetes, and plant gametophytes. Let's note one detail: being haploid does not mean having strictly one haploid genome per cell. Most bacterial cells contain several nucleoids that have not yet had time to separate, but they are all identical (up to de novo mutations). In lower fungi, the hyphae are often not subdivided into individual cells at all. What matters is that a haploid organism has a single variant of the haploid genome in its cells. Finally, some animals, such as Hymenoptera, have a haploid sex: you probably know that honeybee drones are haploid. In their somatic cells the chromosome set doubles, but this does not stop them from being haploids. Mitochondria and plastids are most often inherited only from the mother, so cells are hemizygous for the genes located in the genomes of these organelles. However, in many plants plastids are inherited from both parents, in others this happens occasionally, and paternal mitochondria also enter the zygote, though extremely rarely. In such cases, the offspring receives from the two parents varying proportions of these organelles, not necessarily equal to 1/2. It is then customary to speak of heteroplasmy.

LOCUS GENE

the location of a particular gene on a chromosome.

Dictionary of foreign expressions. 2012

05.05.2015 13.10.2015

The terms allele, locus and marker are widely used in modern genetics. A child's fate often depends on how well such specialized terms are understood, because paternity testing relies directly on these concepts.

Human genetic feature

Each person has a unique set of gene variants inherited from their parents. Combining the parental sets produces a new, unique organism with its own set.
For diagnostic purposes, researchers have identified the regions of the human genome that show the greatest variability. These regions are called loci (also known as DNA markers).
Each of these loci has many genetic variants, called alleles (allelic variants), and the combination a person carries is individual. As a simplified example, a hair color locus might have two possible alleles, dark and light. Each marker has its own number of alleles: some have 7-8, others more than 20. The combination of alleles across all the studied loci is called a person's DNA profile.
It is the variability of these regions that makes genetic kinship testing possible, because at each locus a child receives one allele from each parent.
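The inheritance rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a DNA profile is modeled as a dictionary from locus name to a pair of allele numbers, the locus names are used purely as labels, and the allele values are invented for the example.

```python
def child_consistent(child, mother, father):
    """Return True if, at every locus, the child's two alleles can be
    explained as one allele from the mother and one from the father."""
    for locus, (a, b) in child.items():
        m, f = mother[locus], father[locus]
        if not ((a in m and b in f) or (b in m and a in f)):
            return False
    return True


# Hypothetical profiles: locus name -> pair of alleles (repeat numbers).
mother = {"D3S1358": (15, 16), "vWA": (17, 18)}
father = {"D3S1358": (14, 17), "vWA": (16, 18)}
child = {"D3S1358": (16, 17), "vWA": (18, 18)}

print(child_consistent(child, mother, father))  # True
```

Real casework also has to allow for mutations and typing errors, which this sketch ignores.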

The principle of genetic testing

Genetic paternity testing establishes whether a man who considers himself the parent of a child is the biological father or whether this is excluded. The analysis compares the alleles at each locus between the parents and the child.
Modern DNA analysis methods can examine several loci simultaneously. A standardized test, for example, used to cover 16 markers; today, modern laboratories analyze almost 40 loci.
The analyses are carried out on genetic analyzers (sequencers). The output is an electropherogram showing the loci and alleles of the analyzed sample. DNA analysis thus determines which alleles are present in the sample.

Determining the Probability of Relationship

To determine the degree of relationship, the DNA profiles obtained for the participants are processed statistically, and from the results the expert draws a conclusion about the probability of relationship, expressed as a percentage.
To calculate it, a statistical program compares the allelic variants at all the studied loci across all participants in the analysis. The result is the combined paternity index; a second indicator is the probability of paternity. High values of both are evidence of the biological paternity of the man examined. As a rule, a database of allele frequencies obtained for the population of Russia is used to calculate the kinship indicators.
A match at all 16 DNA markers allows the likelihood of paternity to be determined statistically. However, if the alleles at 3 or more of the 16 markers do not match, the result of the paternity test is considered negative.
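A minimal sketch of these two calculations follows. It assumes the conventional definitions, which the text does not spell out: the combined paternity index is the product of the per-locus indices, and the probability of paternity follows from Bayes' rule with a prior of 0.5. The function names and all numbers are illustrative, not taken from any laboratory software.

```python
from math import prod


def probability_of_paternity(locus_indices, prior=0.5):
    """Combine per-locus paternity indices into a posterior probability.
    Assumes independent loci and the given prior probability."""
    cpi = prod(locus_indices)  # combined paternity index
    return cpi * prior / (cpi * prior + (1 - prior))


def count_mismatches(child, alleged_father):
    """Number of loci at which the child shares no allele with the man."""
    return sum(
        1
        for locus, alleles in child.items()
        if not set(alleles) & set(alleged_father[locus])
    )


def verdict(child, alleged_father, max_mismatches=2):
    """Negative result if 3 or more loci fail to match, as in the text."""
    n = count_mismatches(child, alleged_father)
    return "excluded" if n > max_mismatches else "not excluded"


indices = [2.0, 5.0, 10.0]  # hypothetical per-locus paternity indices
print(probability_of_paternity(indices))
```

With more loci the product grows quickly, which is why adding markers pushes the probability of paternity toward, but never exactly to, 100%.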

Accuracy of examination results

Several factors affect the accuracy of genetic testing results:
the number of analyzed genetic loci;
the nature of the locus.
Analyzing as many highly variable loci as possible makes it possible to establish (or, conversely, refute) paternity with greater confidence.
Thus, when up to 40 loci are analyzed simultaneously, the achievable probability is up to 99.9% for confirming biological paternity and 100% for excluding it.
Confirming biological paternity with 100% probability is impossible, because of the theoretical possibility that another man exists with the same set of DNA markers as the child's father. However, at a probability of 99.9% the test is considered positive and paternity proven.

What DNA sources are suitable for analysis?

DNA testing is a highly sensitive procedure that does not require a large amount of sample for DNA extraction. Thanks to modern scientific advances, genetic paternity testing can use either biological material taken directly from a person (a mouth swab, hair, blood) or objects that have merely been in contact with the person (for example, a toothbrush, a garment, a baby's pacifier, kitchen utensils). This is possible because the DNA is the same in all of a person's cells, regardless of their origin, so a DNA sample from a mouth swab can be compared with one from blood, or with one recovered from a toothbrush or clothing.

New Advances in Paternity Determination

A new development in paternity testing is microchip diagnostics. Because almost all human genes can be represented on a microchip (a small plate), determining paternity will not be difficult. The technology resembles a genetic "passport". DNA can be extracted from a sample of blood, or of amniotic fluid in the case of a fetus, and hybridized on the microarrays of the parents. Researchers also plan to use this technology to detect hereditary diseases.

In pig breeding in Europe and America, genomic selection is beginning to be used. Its technologies make it possible to decipher the genotype of pigs already at birth and select the best animals for breeding. This latest technology is designed to further increase the breeding accuracy and reliability of the breeding value of pigs.

The forerunner of genomic selection is marker selection.

Marker selection is the use of markers to tag the genes of a quantitative trait, which makes it possible to establish the presence or absence of particular genes (gene alleles) in the genome.

A gene is a section of DNA, a certain sequence of nucleotides, which encodes information about the synthesis of one protein molecule (or RNA), and as a result, ensures the formation of any trait and its transmission by inheritance.

Genes represented in a population in several forms - alleles - are polymorphic genes. Alleles of genes are divided into dominant and recessive. Gene polymorphism provides a variety of traits within a species.

However, only a few traits are controlled by individual genes (for example, hair color). Productivity indicators are, as a rule, quantitative traits whose development and expression depend on many genes. Some of these genes may have a more pronounced effect; they are called the major genes of quantitative trait loci (QTL). Quantitative trait loci are DNA segments that contain, or are linked to, the genes underlying a quantitative trait.

The idea of using markers in breeding was first given a theoretical basis by A. S. Serebrovsky back in the 1920s. According to Serebrovsky, a marker (then called a "signal"; the English term "marker" came into use later) is an allele of a gene with a clearly expressed phenotypic manifestation, located next to another allele that determines an economically important trait under study but has no clear phenotypic manifestation of its own. Selecting for the phenotypic manifestation of the signal allele therefore also selects the linked alleles that determine the trait under study.

Initially, morphological (phenotypic) traits were used as genetic markers. However, quantitative traits very often have a complex pattern of inheritance, their manifestation depends on environmental conditions, and the number of phenotypic traits that can serve as markers is limited. Gene products (proteins) were then used as markers. But the most effective way to test genetic polymorphism is not at the level of gene products but directly at the level of genes, that is, using polymorphic DNA nucleotide sequences as markers.

Usually, DNA fragments that lie close together on a chromosome are inherited together. This property allows the marker to be used to determine the exact inheritance pattern of a gene that has not yet been accurately located.

Thus, markers are polymorphic regions of DNA with a known position on the chromosome, but unknown functions, which can be used to identify other genes. Genetic markers must be easily identifiable, linked to a specific locus, and highly polymorphic because homozygotes provide no information.

The widespread use of DNA polymorphisms as genetic markers began in 1980. Molecular genetic markers have been used in programs to conserve the gene pools of farm animal breeds, to study the origin and distribution of breeds, to establish kinship, to map the major quantitative trait loci, to study the genetic causes of hereditary diseases, and to accelerate selection for individual traits such as resistance to particular factors and productivity. In Europe, genetic markers have been used in pig breeding since the early 1990s to free the population from the halothane gene, which causes stress syndrome in pigs.

There are several types of molecular genetic markers. Until recently, microsatellites were very popular, since they are widely distributed in the genome and highly polymorphic. Microsatellites, also called SSRs (Simple Sequence Repeats) or STRs (Short Tandem Repeats), consist of DNA motifs 2-6 base pairs long repeated in tandem many times. For example, the American company Applied Biosystems has developed a test system for genotyping 11 microsatellites (TGLA227, BM2113, TGLA53, ETH10, SPS115, TGLA126, TGLA122, INRA23, ETH3, ETH225, BM1824). However, microsatellites are not dense enough for fine mapping of individual genomic regions, and the high cost of equipment and reagents, together with the development of automated methods based on SNP chips, is pushing them out of practice.
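To make the notion of a tandem repeat concrete, here is a small Python sketch that counts the longest uninterrupted run of a motif in a sequence. The sequences and the CA motif are invented for illustration; real genotyping infers repeat number from fragment length rather than from string matching.

```python
def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Walk forward one motif at a time while the copies continue.
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best


# Two hypothetical alleles of one microsatellite: same flanks, different repeat number.
allele_a = "TTGA" + "CA" * 7 + "GGTC"
allele_b = "TTGA" + "CA" * 11 + "GGTC"

print(longest_tandem_run(allele_a, "CA"))  # 7
print(longest_tandem_run(allele_b, "CA"))  # 11
```

Alleles of a microsatellite differ in exactly this repeat count, which is what makes such loci highly polymorphic.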

A very convenient type of genetic marker is the SNP (Single Nucleotide Polymorphism, pronounced "snip"): a difference of a single nucleotide in the DNA sequence between the genomes of members of the same species, or between homologous regions of an individual's homologous chromosomes. SNPs are point mutations that can arise spontaneously or through the action of mutagens. A difference of even one base pair can change a trait. SNPs are widely distributed in the genome (humans have about 1 SNP per 1000 base pairs), and the pig genome contains millions of point mutations. No other type of genomic difference can provide such a density of markers. In addition, unlike microsatellites, SNPs have a low mutation rate per generation (~10^-8), which makes them convenient markers for population genetic analysis. The main advantage of SNPs is that they can be detected by automated methods, for example with DNA microarrays (chips).
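As a simple illustration of what a single nucleotide difference looks like in data, the sketch below compares two aligned sequences of equal length and reports the positions where they differ. The sequences are made up, and the function name is an assumption for this example only.

```python
def snp_positions(seq_a, seq_b):
    """List (index, base_in_a, base_in_b) for every single-base difference
    between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


reference = "ATGGCATTCGA"  # hypothetical reference fragment
variant = "ATGGCGTTCGA"    # same fragment with one substitution

print(snp_positions(reference, variant))  # [(5, 'A', 'G')]
```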

To increase the number of SNP markers, a number of foreign companies have recently joined forces to create a single database. The aim is to genotype a large number of animals with recorded productivity and to look for associations between known point mutations and productivity.

Currently, a large number of polymorphic gene variants and their mutual influence on the productive traits of pigs have been identified. Some genetic tests using performance markers are publicly available and used in breeding programs. Using such markers, you can improve some productive indicators.

Examples of productivity markers:

  • fertility markers: ESR, estrogen receptor gene; EPOR, erythropoietin receptor gene;
  • disease resistance markers – ECR F18 receptor gene;
  • markers of growth efficiency, meat productivity - MC4R, HMGA1, CCKAR, POU1F1.

MC4R, the melanocortin 4 receptor gene, is located in pigs on chromosome 1 (SSC1), q22-q27. The replacement of one nucleotide, A by G, changes the amino acid composition of the MC4 receptor. As a result, the regulation of secretion by adipose tissue cells is disturbed, which disrupts lipid metabolism and directly affects the traits that characterize the fattening and meat qualities of pigs. The A allele determines rapid growth and greater backfat thickness, while the G allele is responsible for growth efficiency and a high percentage of lean meat. Homozygous AA pigs reach market weight three days faster than pigs homozygous for the G allele (GG), but GG pigs have 8% less fat and better feed conversion.

Other genes that control a complex of associated physiological processes also affect meat and fattening productivity. The POU1F1 gene encodes a pituitary transcription factor that regulates the expression of growth hormone and prolactin. In pigs, the POU1F1 locus maps to chromosome 13. Its polymorphism is due to a point mutation that gives rise to two alleles, C and D. The presence of the C allele in a pig's genotype is associated with higher average daily weight gain and earlier maturity.

Markers also make it possible to test a boar's genotype for sex-limited traits that appear only in sows. One example is fecundity (the number of piglets per farrowing), which the boar passes on to his offspring. Testing the boar's genotype for estrogen receptor (ESR) markers, for instance, allows breeders to select boars that will pass higher reproductive qualities to their daughters.

Using the results of marker selection, it is possible to estimate the frequency of occurrence of desirable and undesirable alleles for a breed or line, and further selection should be carried out so that all animals in the breed have only preferred gene alleles.
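Estimating allele frequencies from genotyping results is simple counting, as the following sketch shows for a locus with two alleles. The allele labels A and G follow the MC4R example above; the genotype counts are hypothetical.

```python
def allele_frequencies(n_aa, n_ag, n_gg):
    """Frequencies of alleles A and G from genotype counts.
    Each animal carries two alleles, so heterozygotes contribute one of each."""
    total_alleles = 2 * (n_aa + n_ag + n_gg)
    freq_a = (2 * n_aa + n_ag) / total_alleles
    return freq_a, 1 - freq_a


# Hypothetical herd: 30 AA, 50 AG and 20 GG animals.
freq_a, freq_g = allele_frequencies(30, 50, 20)
print(round(freq_a, 2), round(freq_g, 2))  # 0.55 0.45
```

Tracking these frequencies from generation to generation shows whether selection is moving the breed toward the preferred allele.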

Fig. 1. The principle of operation of the oligonucleotide biochip

A DNA chip is a substrate divided into cells, each carrying a deposited reagent. The test material is tagged with labels (often fluorescent dyes) and applied to the biochip. As shown in the figure, the reagent, an oligonucleotide, binds only the complementary fragment among the fluorescently labeled DNA fragments of the test material. As a result, that element of the biochip glows.
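The complementarity rule behind the chip can be sketched as follows. This is a deliberate simplification that treats hybridization as an exact match to the reverse complement of the probe; real hybridization tolerates some mismatches and depends on temperature and salt conditions. The sequences are invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe, fragment):
    """True if the fragment contains the exact complement of the probe
    (a simplified stand-in for binding on the chip)."""
    return reverse_complement(probe) in fragment


probe = "ATGCCG"               # hypothetical oligonucleotide on the chip
matching = "TTCGGCATAA"        # contains CGGCAT, the reverse complement
non_matching = "TTCGACATAA"    # one base differs, so no exact match

print(hybridizes(probe, matching))      # True
print(hybridizes(probe, non_matching))  # False
```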

In 2009, the pig genome was deciphered. An SNP chip (a variant of the DNA microchip) containing 60,000 genetic markers of the genome was developed. To speed up research, special robots were even created to read SNPs. A porcine DNA sample can be tested for the presence or absence of virtually all the important point mutations that determine productive traits. The best animals can thus be selected on the basis of genetic markers, without measuring phenotypic indicators.

These advances have led to the introduction of a new technology - genomic selection. Genomic selection is the testing of the genome at once for a large number of markers covering the entire genome, so that quantitative trait loci (QTL) are in linkage disequilibrium with at least one marker. In genomic breeding, genome scanning occurs using chips (matrices) with 50-60 thousand SNPs (which mark the main genes of quantitative traits) to identify single nucleotide polymorphisms along the animal's genome, determine genotypes with the desired manifestation of a set of productive traits, and assess the breeding value of the animal.

The term "genomic selection" was first introduced by Haley and Visscher in 1998. In 2001, Meuwissen et al. developed and presented a methodology for the analytical evaluation of breeding value using a genome-wide marker map.
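The core idea of evaluating breeding value from a marker map can be illustrated with a purely additive model, in which each SNP contributes its estimated effect multiplied by the number of copies of the counted allele. This is a minimal sketch under that assumption and not the methodology of the authors cited above; the SNP names and effect sizes are hypothetical.

```python
def genomic_breeding_value(genotype, effects):
    """Sum of marker effects weighted by allele count (0, 1 or 2) at each SNP."""
    return sum(genotype[snp] * effects[snp] for snp in effects)


# Hypothetical effects per copy of the counted allele, in trait units.
effects = {"snp1": 0.5, "snp2": -0.2, "snp3": 0.1}

# Hypothetical animal: two copies at snp1, one at snp2, none at snp3.
animal = {"snp1": 2, "snp2": 1, "snp3": 0}

print(round(genomic_breeding_value(animal, effects), 2))  # 0.8
```

In practice the effects are themselves estimated from a reference population with both genotypes and phenotypes, and the sum runs over tens of thousands of markers.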

The practical application of genomic selection began in 2009.

Since 2009, the largest companies in the USA (Cooperative Resources International), the Netherlands, Germany, and Australia have begun to introduce genomic selection into cattle breeding programs. Bulls of various breeds have been genotyped for over 50,000 SNPs.

Hypor was the first to announce a full commercial genomic breeding program, intended to increase the accuracy of selection in pig breeding. In June 2012 it was announced in the media that Hypor could offer its customers breeding stock with genomic breeding values.

Hypor, a genetics company, has been using genomic selection since 2010, working closely with Hendrix Genetics' Center for Research and Emerging Technologies. Hendrix Genetics tests over 60,000 SNP markers and uses this information for DNA research. The genomic index of a pig's genetic potential is calculated after analyzing 60,000 gene markers (SNPs) for the animal. In theory, if there are enough genetic markers to cover all of a pig's DNA (its genome), all genetic variation for all measurable traits can be described. Modern mathematical-genetic software for data processing is being prepared.

The genetics company Hendrix Genetics has a large biobank: it stores blood and tissue samples from breeding animals of several farms and generations for DNA research (determining the genetic value of animals) and genotype analysis. Hypor has been conducting pig DNA testing at its breeding facilities for over two years. All samples from breeding farms in different countries are sent for processing to the new Hendrix Genetics Central Genomic Laboratory in Ploufragan (France). Gerard Albers, Director of the Center for Research and New Technologies, emphasizes: "The Genomic Lab is a valuable asset shared by all the genetic companies that make up Hendrix Genetics and is truly unique in the swine industry."

Genomic selection is a powerful tool for future use. Currently, the effectiveness of genomic selection is limited by the different nature of the interaction between the loci of quantitative traits, the variability of quantitative traits in different breeds, and the influence of environmental factors on the manifestation of a trait. But the results of studies in many countries have confirmed that the use of statistical methods in conjunction with genomic scanning increases the reliability of breeding value prediction.

Selection of pigs using statistical methods for some indicators (for example, disease resistance, meat quality, fertility) is characterized by low efficiency. This happens due to the following factors:

  • low heritability of the traits,
  • a strong influence of environmental factors on the trait,
  • sex-limited manifestation,
  • manifestation of the trait only under the influence of certain factors,
  • relatively late manifestation of the trait,
  • traits that are difficult to measure (for example, health characteristics),
  • the presence of hidden carriers of the trait.

For example, such a defect in pigs as stress sensitivity is difficult to diagnose and manifests itself in increased mortality of piglets under the influence of stress (transportation, etc.) and deterioration in meat quality. DNA testing using gene markers makes it possible to identify all carriers of this defect, including latent ones, and to carry out selection taking this into account.

A more reliable assessment of productivity indicators that are difficult to predict by statistical methods requires progeny testing, that is, waiting for the offspring and analyzing their breeding value. DNA markers, by contrast, make it possible to analyze the genotype immediately at birth, without waiting for the trait to appear or for offspring to be born, which significantly speeds up selection.

Index evaluation of animals is carried out according to the exterior and productive qualities (early maturity of piglets, etc.). In both cases, phenotypic indicators are used, therefore, in order to use these traits in calculations, it is necessary to know their heritability coefficient. However, even in this case, we will be dealing with the probability of the genetic substantiation of any trait, the average indicators of its ancestors and descendants (there is no way to determine which genes a young animal inherited: better or worse than this average). With the help of genotype analysis, it is possible to accurately establish the fact of inheritance of certain genes already at birth, to evaluate genotypes directly, and not through phenotypic manifestations.

However, if pigs are selected for traits that are highly heritable, such as an easily quantifiable number of teats, genomic selection will not bring significant benefits.

Marker selection does not negate traditional approaches to determining breeding value. Statistical analysis and genomic selection technologies complement each other. The use of genetic markers makes it possible to speed up the process of animal selection, and index methods - to more accurately assess the effectiveness of this selection.

Genomic breeding is an opportunity to make pig production a precision production. The use of genomic selection technologies will make it possible to produce a variety of meat products that meet consumer demand.


Linkage analysis is a gene mapping method that uses family studies to determine the relationship between two genes as they are passed from one generation to the next. To decide whether two loci are linked, and if so how closely, we rely on two kinds of information.

First, we establish whether the recombination frequency Q between the two loci deviates significantly from 0.5; determining linkage between two loci is equivalent to asking whether the proportion of recombinants between them differs from the 0.5 expected for unlinked loci.

Second, if the recombination fraction is less than 0.5, we need the best estimate of it, since this shows how closely the loci are linked. In both cases, the statistical method of likelihood ratios is used. Likelihoods are measures of probability; odds are ratios of likelihoods. Likelihood ratios are calculated as follows.

We study actual family data, count the number of children showing recombination between the loci, and calculate the likelihood (probability) of the observed data for values of Q in the range from 0 to 0.5. A second likelihood is then calculated under the hypothesis that the two loci are not linked, i.e. Q = 0.50. The ratio of the likelihood of the family data at a given Q to the likelihood under no linkage gives the odds ratio, formed from:
1) the likelihood of the data if the loci are linked with some coefficient Q;
2) the likelihood of the data if the loci are not linked (Q = 0.50).

The odds ratios computed for the different values of Q are usually expressed as decimal logarithms and called the LOD score (Z), for "log of the odds". (Using logarithms allows data from different families to be combined by simple addition.)
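The calculation just described can be sketched in Python for the simplest situation. The sketch assumes that every meiosis can be scored unambiguously as recombinant or non-recombinant (phase known, fully informative), which is not always true of real pedigrees; the function name and the example counts are illustrative.

```python
from math import log10


def lod_score(q, recombinants, nonrecombinants):
    """Z(Q) = log10 of L(data | Q) / L(data | Q = 0.5).

    Assumes each meiosis is scored as recombinant or non-recombinant.
    Undefined (raises ValueError) when the likelihood at q is zero,
    e.g. q = 0 with at least one recombinant.
    """
    n = recombinants + nonrecombinants
    likelihood_linked = q ** recombinants * (1 - q) ** nonrecombinants
    likelihood_unlinked = 0.5 ** n
    return log10(likelihood_linked / likelihood_unlinked)


# Hypothetical family: 1 recombinant and 9 non-recombinant children.
for q in (0.05, 0.1, 0.2, 0.3, 0.5):
    print(q, round(lod_score(q, 1, 9), 2))
```

At Q = 0.5 the two likelihoods coincide, so the score is zero; values of Q that explain the data better than free recombination give positive scores.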

Linkage analysis of Mendelian diseases based on models (prototypes)

Linkage analysis is called model (or parametric) if it is assumed that there is a specific type of inheritance (autosomal dominant, autosomal recessive, or X-linked) that explains the inheritance of the trait.

Analysis of the LOD score allows mapping of genes whose mutations cause diseases that are transmitted by the Mendelian type.

LOD score gives:
- the best estimate of the recombination frequency (Qmax) between the marker locus and the disease locus;
- an assessment of how well the linkage is confirmed by this value Qmax. LOD scores greater than 3 are considered reliable evidence.

Linkage at a specific Qmax between a disease gene locus and a marker with a known physical position implies that the disease gene locus must lie near the marker.

The odds ratio is important in two ways. First, it provides a statistically sound method for using family data to estimate the recombination frequency between loci. Statistical theory says that the value of Q that gives the largest Z is the best estimate of the recombination fraction that can be made from the available data. This value is called Qmax. If Qmax differs from 0.50, we have evidence of linkage.

Second, although Qmax is the best estimate of Q, how good is it? The odds ratio answers this question too, since the higher the Z value, the better the Qmax estimate. Positive Z values (odds > 1) at a given Q indicate that the two loci are linked, while negative values (odds < 1) suggest that linkage is less likely than the possibility that the two loci are unlinked.

Gene mapping

Linkage analysis makes it possible to localize medically important genes by following the inheritance of a disease and of alleles at polymorphic markers, provided the disease locus and the marker locus are linked. Let's return to the family shown in the figure. The mother has an autosomal dominant form of retinitis pigmentosa. There are dozens of other forms of this disease, many of which have been mapped to specific locations in the genome and whose genes are known.

We do not know which form of retinitis pigmentosa the mother has. She is also heterozygous at two loci on chromosome 7 (one at 7p14 and one at the distal end of the long arm). It can be seen that in this family the mutant allele (D) invariably "followed" the B allele at the marker 2 locus from the first generation to the second. All three affected offspring (who evidently inherited the maternal mutant D allele at the RP locus) also inherited the B allele at the marker 2 locus. All offspring who inherited the maternal normal d allele inherited the b allele and did not have retinitis pigmentosa. At the same time, the retinitis pigmentosa gene shows no tendency to follow the allele at the marker 1 locus.

We would obtain the "true" recombination fraction Q between the retinitis pigmentosa locus and marker locus 2 only if we had an unlimited number of offspring. From this point of view, Q can be thought of as the probability that recombination occurs between the two loci in each meiosis. Since recombination either occurs or does not, the probability of recombination (Q) and the probability of no recombination must add up to one. Therefore, the probability that recombination does not occur is 1-Q. In fact, there are only six offspring, all of them non-recombinant.

Because each meiosis is an independent event, we multiply the probabilities of recombination (Q) or no recombination (1-Q) for each child. The probability of seeing no children with recombination and six without recombination between retinitis pigmentosa and marker locus 2 is thus $Q^{0} \times (1-Q)^{6}$. The LOD score between the retinitis pigmentosa locus and marker 2 is:

$$Z(Q) = \log_{10} \frac{Q^{0}\,(1-Q)^{6}}{(0.5)^{0}\,(0.5)^{6}}$$

The maximum Z value is 1.81, reached at Q = 0. This suggests linkage but does not prove it, since Z is positive but less than 3.
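The value 1.81 can be checked numerically. The sketch below evaluates Z over a grid of Q values for this family (no recombinants, six non-recombinants) and picks the maximum; the grid spacing of 0.01 is an arbitrary choice for the illustration.

```python
from math import log10


def lod(q, rec, nonrec):
    """LOD score for rec recombinant and nonrec non-recombinant meioses."""
    return log10(q ** rec * (1 - q) ** nonrec / 0.5 ** (rec + nonrec))


# Candidate recombination fractions Q = 0.00, 0.01, ..., 0.49.
grid = [i / 100 for i in range(0, 50)]

q_max = max(grid, key=lambda q: lod(q, 0, 6))
z_max = lod(q_max, 0, 6)

print(q_max, round(z_max, 2))  # 0.0 1.81
```

With no recombinants the score falls steadily as Q increases, so the maximum sits at Q = 0, where Z = log10(2^6) = log10(64).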

Combining LOD score information from different families

Just as each meiosis within a family that produces a non-recombinant or recombinant offspring is an independent event, so are the meioses occurring in other families. We can therefore multiply the likelihoods in the numerators and denominators of the likelihood ratios of the individual families. An equivalent but more convenient calculation is to add the logarithms (log10) of all the computed likelihood ratios to form a combined Z score for all families.

Pedigree of inheritance of retinitis pigmentosa

In the retinitis pigmentosa example shown in the figure, suppose two other families were also studied: the second showed no recombination between locus 2 and retinitis pigmentosa in four children, and the third showed none in five children. Individual LOD scores are calculated for each family and then added together (1.81 + 1.20 + 1.51, or about 4.5 at Q = 0, which exceeds the threshold of 3). In this case, one could say that the retinitis pigmentosa gene in this group of families is linked to locus 2.
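The addition of LOD scores across the three families can be sketched as follows, under the same simplifying assumption that every meiosis is scored unambiguously. The family counts are those given in the text; the function name is illustrative.

```python
from math import log10


def lod(q, rec, nonrec):
    """LOD score for rec recombinant and nonrec non-recombinant meioses."""
    return log10(q ** rec * (1 - q) ** nonrec / 0.5 ** (rec + nonrec))


# (recombinants, non-recombinants) for the three families in the example.
families = [(0, 6), (0, 4), (0, 5)]

scores = [lod(0.0, rec, nonrec) for rec, nonrec in families]
total = sum(scores)

print([round(z, 2) for z in scores])  # [1.81, 1.2, 1.51]
print(round(total, 2))                # 4.52
```

No single family reaches the threshold of 3, but the combined score does, which is the practical benefit of working with logarithms.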

Since the chromosomal position of polymorphic locus 2 is known (7p14), retinitis pigmentosa in this family may map to the region around 7p14, near the RP9 locus already defined for one form of autosomal dominant retinitis pigmentosa.
