Biology

MDPI AG

All preprints, ranked by how well they match Biology's content profile, based on 11 papers previously published here. The average preprint has a 0.06% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Comparison of the capacity of several machine learning tools to assist immunofluorescence-based detection of anti-neutrophil cytoplasmic antibodies

Bertin, D.; Bongrand, P.; Bardin, N.

2024-01-28 allergy and immunology 10.1101/2024.01.26.24301725
Top 0.1%
85× avg

The success of artificial intelligence and machine learning is an incentive to develop new algorithms to increase the rapidity and reliability of medical diagnosis. Here we compared different strategies aimed at processing microscope images used to detect anti-neutrophil cytoplasmic antibodies, an important vasculitis marker: (i) basic classifier methods (logistic regression, k-nearest neighbors and decision tree) were used to process custom-made indices derived from immunofluorescence images yielded by 137 sera; (ii) these methods were combined with dimensional reduction to analyze 1733 individual cell images; (iii) more complex models based on neural networks were used to analyze the same dataset. The efficiency of discriminating between positive and negative samples and different fluorescence patterns was quantified with a Rand-type accuracy index, the kappa index and the ROC curve. It is concluded that basic models trained on a limited dataset allowed positive/negative discrimination with an efficiency comparable to that obtained by conventional analysis performed by humans (0.84 kappa score). More extensive datasets may be required for efficient discrimination between different fluorescence patterns generated by different auto-antibody species.

2
A simple Covid-19 Epidemic Model and Containment Policy in France

Quadrat, J.-P.

2020-04-29 epidemiology 10.1101/2020.04.25.20079434
Top 0.1%
80× avg

We show that the standard SIR model is not effective at predicting the propagation of the 2019-20 coronavirus pandemic. We propose a new model where the logarithm of the detected population number follows a linear dynamical system. We estimate the parameters of this system and compare models obtained with data observed from different countries. Based on the given estimator and results obtained with Pr. Raoult's treatment, we affirm with a reasonable degree of confidence that his "test-treat-noconfine" policy was less expensive in human lives than the "confine and wait for a proved treatment" policy adopted by the French government.

3
On the corona infection model with contact restriction

Mimkes, J.; Janssen, R.

2020-04-11 epidemiology 10.1101/2020.04.08.20057588
Top 0.1%
54× avg

This article presents a mathematical infection model that is designed to estimate the course of coronavirus infection in Germany for several days in advance: How many people become ill or die, what is the temporal development? If the contact restriction is perfect, then the model predicts the development of the virus infection after the initial subsidence of the infection. However, since this restriction cannot always be strictly adhered to, the model is dynamically adapted to the development. This makes it possible to estimate the number of infected people, the number of new infections and deaths in Germany about a week in advance.

4
Covid-19 Epidemic Prediction in France: the Multimodal Case.

Quadrat, J.-P.

2021-10-12 epidemiology 10.1101/2021.10.09.21264794
Top 0.1%
53× avg

In two previous papers we have proposed models to estimate the Covid-19 epidemic when the number of daily positive cases has a bell-shaped form that we call a mode. We have observed that each Covid variant produces this type of epidemic shape at a different moment, resulting in a multimodal epidemic shape. We will show in this document that each mode can still be estimated with the models described in the two previous papers, provided we replace the cumulated number of positive cases $y$ by the cumulated number of positive cases reduced by a parameter $P$ to be estimated. Therefore, denoting by $z$ the logarithm of $y - P$, $z$ approximately follows the differential equation $\dot{z} = b - a z^{r}$, where $a$, $b$, $r$ also have to be estimated from the observed data. We will show the obtained predictions on the four French modes: April 2020, November 2020, May 2021 and September 2021. The comparison between the prediction obtained before the containment decisions made by the French government and the observed data afterwards suggests the inefficiency of the epidemic lockdowns.

5
Morphological and Functional Assessment of Thyroid in Individuals with Down Syndrome

Kalil Mangabeira, C. N.; Kalil Mangabeira, R.; Oliveira Andrade, L. J. d.

2021-05-17 endocrinology 10.1101/2021.05.13.21256919
Top 0.1%
51× avg

Individuals with Down syndrome (DS) present increased risk for thyroid dysfunction, especially hypothyroidism, due to increased expression of the DYRK1A gene. Objective: The aim of this study was to make a morphological-functional thyroid assessment in individuals with DS. Materials and Methods: This is a descriptive cross-sectional study, consisting of 29 individuals with DS, with a mean age of 12.3 ± 9.5 (0.66 - 36.00) years, 16 women (55.2%) and 13 men (44.8%), with a morphological-functional thyroid assessment being made comprising hormonal dose (Free T4, TSH), antithyroid antibody (TPOAb and TgAb) and ultrasonography of the thyroid. Results: Twenty-three (79.3%) individuals presented normal thyroid function while 6 (20.7%) presented with thyroid dysfunction, 4 with hypothyroidism and 2 with hyperthyroidism. Autoimmune thyroiditis and goiter were present in 27.6% of the individuals. Conclusion: Thyroid function should be assessed periodically in individuals with DS, in view of the high prevalence of thyroid dysfunction, especially autoimmune thyroiditis with consequent hypothyroidism.

6
PATCRdb: Database of TCRs from data mining patent documents

Lee, Y.; Freitag, R.; Ganesan, R.; Schwammle, V.; Kumar, S.; Krawczyk, K.

2023-01-07 allergy and immunology 10.1101/2023.01.05.23284150
Top 0.1%
51× avg

T-cells are crucial actuators of the innate immune system. Because their receptors recognize intracellular disease markers, there is considerable interest in developing them as novel biotherapies. Computational methods to support discovery, design and development of TCR-based therapeutics need robust repositories of curated sequence and structural information on TCRs. The urgency of this need is highlighted by the recent approval of the first TCR biotherapeutic, tebentafusp. In this work, we have collected patent data on TCR sequences to provide early access to TCRs that are in various stages of product and clinical development (pre-FDA approvals) and are already past the initial discovery / proof of concept (scientific publications) stages. We employ literature mining to identify patent documents disclosing TCR sequences. Such documents are further analyzed to provide a bird's-eye view of the TCR patenting landscape. We compile the information into a database available at http://github.com/konradkrawczyk/patcrdb that we hope will help TCR engineers.

7
SNP Genes Effect on Thyroid Disorders in a Chinese Demographic

Fan, I.; Zhou, F.

2024-08-31 endocrinology 10.1101/2024.08.29.24312609
Top 0.1%
51× avg

Thyroid disorders, particularly hypothyroidism, are prevalent in the Chinese population and have been linked to specific genetic variations. This study investigates the associations between single nucleotide polymorphisms (SNPs) and thyroid disorders in a cohort of Chinese individuals. It aims to explore a novel aspect of thyroid disorders, precisely the effect of different SNPs on the prevalence of developing these disorders, autoimmune diseases, or cancer. It focuses on four SNPs: rs965513, rs179247, rs3087243, and rs231779. The analysis revealed significant associations between these SNPs and thyroid disorders, with the A allele of rs179247 showing a higher risk.

8
Screening for Maternally Inherited Diabetes and Deafness in Large Cohorts of Hearing Impaired and Diabetic Patients

Varga, L.; Borecka, S.; Skopkova, M.; Rambani, V.; Sklenar, M.; Cipkova, K.; Kickova, T.; Ugorova, D.; Kabatova, Z.; Stanik, J.; Profant, M.; Gasperikova, D.

2025-03-17 endocrinology 10.1101/2025.03.13.25321027
Top 0.1%
49× avg

Objectives: Mitochondrial DNA (mtDNA) mutations account for up to 5% of hereditary hearing loss cases. Most commonly, the m.3243A>G mtDNA variant contributes to rare monogenic MIDD (Maternally Inherited Diabetes and Deafness) or MELAS (Mitochondrial Encephalopathy, Lactic Acidosis, and Stroke-like episodes) syndromes. Different proportions of the mutated mtDNA (heteroplasmy) among the affected tissues result in variability in the clinical manifestation and severity of the phenotype. The aim of the presented study was to establish the prevalence of the m.3243A>G variant in large cohorts of hearing-impaired and diabetic patients in Slovakia and to evaluate the genotype-phenotype correlations and long-term cochlear implantation outcomes. Design: Probands (n=5957) were recruited via three independent nationwide studies on hereditary hearing loss (n=1145) and diabetes (unselected diabetes group, n=4158 and Monogenic diabetes group, n=654; total n=4812). DNA from peripheral blood and/or buccal mucosa was tested for the presence of the m.3243A>G variant using two PCR methods - qPCR and dPCR. Audiological and other clinical data of the identified variant carriers were also collected for phenotype evaluation. Results: We identified 25 probands/families harboring the m.3243A>G variant (0.42%). The prevalence was higher in the groups where monogenic disorder was suspected - 0.79% in the Hearing loss group and 1.68% in the Monogenic diabetes group versus 0.14% in the general diabetes group (p < 0.001). Heteroplasmy levels assessed by dPCR ranged between 0.04% and 76% in peripheral blood and 0.01% and 92% in buccal samples. In most individuals, the symptoms manifested in the fourth decade of life in affected subjects with the MIDD phenotype or isolated hearing loss/diabetes, but as early as in the second decade in the probands with MELAS.
We observed high phenotype variability, ranging from severe multisystemic involvement through isolated symptoms to asymptomatic young "dormant" or very low heteroplasmy carriers. Only 54% of individuals with the m.3243A>G variant had both diabetes and hearing loss. The heteroplasmy levels from buccal swabs showed a better correlation with the age of onset of both hearing loss and diabetes than the age-adjusted blood heteroplasmy. On the other hand, the age-adjusted blood heteroplasmy was associated with overall severity of the disease (i.e., with a higher number of clinical symptoms). We show that the most typical audiogram configurations are flat and sloping. Three individuals identified as cochlear implant recipients showed excellent and long-term stable functional outcomes. In addition, the authors report the first case of successful stapes surgery in a patient with confirmed mitochondrial disorder. Conclusions: The diagnostic yield was higher in the deafness and monogenic diabetes groups than in the unselected diabetes group. Implementation of rigorous inclusion criteria requiring the presence of both diabetes and hearing loss may lead to a lower detection rate due to different or incomplete phenotype manifestation. Age-adjusted blood heteroplasmy levels seem to be a good predictor of overall severity of m.3243A>G-associated diseases, but buccal mucosa heteroplasmy better predicted the age of hearing loss and diabetes onset. We further confirm that cochlear implantation and stapedectomy are safe and efficient options for hearing restoration and rehabilitation in m.3243A>G carriers.

9
Trend analysis of the COVID-19 pandemic in China and the rest of the world

Weber, A.; Iannelli, F.; Goncalves, S.

2020-03-23 epidemiology 10.1101/2020.03.19.20037192
Top 0.1%
48× avg

The recent epidemic of Coronavirus (COVID-19) that started in China has already been "exported" to more than 140 countries in all the continents, evolving in most of them by local spreading. In this contribution we analyze the trends of the cases reported in all the Chinese provinces, as well as in some countries that, until March 15th, 2020, had more than 500 cases reported. Notably and differently from other epidemics, the provinces did not show an exponential phase. The data available at the Johns Hopkins University site [1] seem to fit well an algebraic sub-exponential growing behavior as was pointed out recently [2]. All the provinces show a clear and consistent pattern of slowing down, with the growth exponent going nearly to zero, so it can be said that the epidemic was contained in China. On the other side, the more recent spread in countries like Italy, Iran, and Spain shows a clear exponential growth, as well as in other European countries. Even more recently, US --which was one of the first countries to have an individual infected outside China (Jan 21st, 2020)-- seems to follow the same path. We calculate the exponential growth of the most affected countries, showing the evolution along time after the first local case. We identify clearly different patterns in the analyzed data and we give interpretations and possible explanations for them. The analysis and conclusions of our study can help countries that, after importing some cases, are not yet in the local spreading phase, or have just started. Highlights: (i) All the provinces of China show very similar epidemic behaviour. (ii) Early stages of spreading can be explained in terms of the standard SIR model, considering that reported cases account for the removed individuals, with algebraic (sub-exponential) growth in most locations. (iii) Worldwide, we observe two classes of epidemic growth: sub-exponential during almost all stages (China and Japan) and exponential in the rest of the countries, following the early stage. (iv) The exponential growth rates range from 0.016 day⁻¹ (South Korea) to 0.725 day⁻¹ (Brunei), which means 1.6% to 107% of new cases per day, for the different countries but China.
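The relation between a growth rate per day and a percentage of new cases per day, quoted in the last highlight, can be illustrated with a minimal Python sketch. It assumes pure exponential growth N(t) = N0·exp(r·t), so that the one-day increase is exp(r) − 1; the abstract does not state which formula the authors used, and the function names are illustrative only. Under this assumption the two quoted rates give about 1.6% and about 106% per day, close to the figures in the abstract.

```python
import math


def daily_percent_increase(rate_per_day: float) -> float:
    """Percent growth in case count over one day, assuming N(t) = N0 * exp(r * t)."""
    return (math.exp(rate_per_day) - 1.0) * 100.0


def doubling_time_days(rate_per_day: float) -> float:
    """Time for the case count to double under the same exponential assumption."""
    return math.log(2.0) / rate_per_day


# Growth rates quoted in the abstract (per day).
for country, rate in [("South Korea", 0.016), ("Brunei", 0.725)]:
    print(
        f"{country}: r = {rate}/day, "
        f"{daily_percent_increase(rate):.1f}% per day, "
        f"doubling time {doubling_time_days(rate):.1f} days"
    )
```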

10
The age-stratified analytical model for the spread of the COVID-19 epidemic

Mairanowski, F.; Below, D.

2021-07-15 epidemiology 10.1101/2021.07.13.21260459
Top 0.1%
47× avg

The previously developed ASILV model for calculating epidemic spread under conditions of lockdown and mass vaccination was modified to analyse the intensity of COVID-19 infection growth in the allocated age groups. Comparison of the results of calculations of the epidemic spread, as well as the values of the seven-day incidence values with the corresponding observation data, shows their good correspondence for each of the selected age groups. The greatest influence on the overall spread of the epidemic is in the 20-40 age groups. The relatively low level of vaccination and the high intensity of contact in these age groups contributes to the emergence of new waves of the epidemic, which is especially active when the virus mutates and the lockdown conditions are relaxed. The intensity of the epidemic in the 90+ age group has some peculiarities compared to other groups, which may be explained by differences in contact patterns among individuals in this age group compared to others. Approximate ratios for estimating mortality as a function of the intensity of infection for individual age groups are provided. The proposed stratified ASILV model by age group will allow more detailed and accurate prediction of the spread of the COVID-19 epidemic, including when new, more transmissible versions of the virus mutate and emerge.

11
Immunological variables and tumor types influence one-year survival probability in cancer patients: A comprehensive analysis using logistic regression and decision tree models

Lopez Malizia, A.

2023-10-26 allergy and immunology 10.1101/2023.10.25.23297566
Top 0.1%
47× avg

The present study aimed to explore immunological variables associated with survival, TP53 gene expression, and primary diagnosis in patients with cancer. Based on these variables, logistic regression and decision tree models (lightGBM) were used to model the probability of one-year survival of patients following their initial diagnosis. Logistic regression revealed the significance of primary diagnosis categories such as Malignant Melanoma, Ovarian Cancer, and Glioblastoma as predictor variables. For the classification model, in addition to these tumor types, variables related to the immune system were also found to be important, including tumor cell percentage, stromal cell percentage, lymphocytes, and necrotic cells. In addition, unsupervised classification techniques were employed to explore the numerical dataset. For this methodology, the best clustering cohesion was observed with two groups determined using different algorithms. The clusters generated by k-means and DBSCAN exhibited differences in the proportion of infiltrating lymphocytes, neutrophils, and monocytes.

12
Predicting Common Pathway Signatures Between DNA Methylation and Post Translational Modification in Type II Diabetes & Parkinson's Disease Using Heterogeneous Data Integration

Biswas, S.; Mitra, P.; Rao, K. S.

2024-09-27 endocrinology 10.1101/2024.09.26.24314438
Top 0.1%
47× avg

The complex diseases, namely, Type 2 Diabetes Mellitus (T2DM) and Parkinson's Disease (PD), are extensively studied due to their prevalence in a large population group. Between these two diseases, T2DM is denoted as the zero index disease in a patient, which may lead to PD in a more advanced clinical stage. Both of these diseases may occur due to abrupt DNA methylation of genes. Likewise, both diseases may occur in a patient due to protein misfolding. Our study proposes a novel framework for building two disease-specific heterogeneous networks by integrating different tissue-based transcriptomics, epigenetics, epistasis, and PPI-based topological information. We predict the missing links between DNA methylation and the Post-Translational Modifications (PTMs) associated with protein aggregation. Next, we have predicted the common signature of the prevalence of linked patterns in both diseases, further validated by relevant biological evidence.

13
A New Perspective On Isotretinoin In Pregnancy: Pregnancy Outcomes, Evaluation Of Complex Phenotypes, And Importance Of Teratological Counselling

Alay, M. T.; Kalayci, A.; Seven, M.

2023-06-05 public and global health 10.1101/2023.06.02.23290862
Top 0.1%
46× avg

Background: Teratogens are responsible for 5% of all known causes of congenital anomalies. Isotretinoin, a retinoic acid-derived agent, leads to congenital anomalies in 21-52% of cases when exposure occurs during pregnancy according to studies conducted before 2006. However, rates of congenital anomalies were much lower in later studies. Objectives: To investigate the rates of congenital anomalies in isotretinoin exposure during pregnancy, isotretinoin exposure before pregnancy, and a control group unexposed to any teratogenic agents. Methods: In this cohort study, we divided pregnant women admitted to our center between 2009 and 2020 into two groups: isotretinoin exposure during the pregnancy (n=77) and isotretinoin exposure before the pregnancy (n=75). We selected the control group from among the non-teratogen exposed pregnant women with a simple random sampling method. Obstetricians calculated the ages of all pregnancies via ultrasound (USG) (crown-rump diameter for the first trimester; biparietal diameter and femur length for the second trimester). After birth, a pediatric genetics specialist examined all babies. Results: Among the isotretinoin exposure during the pregnancy, isotretinoin exposure before the pregnancy, and the control groups, there were statistically significant differences in live births (respectively, 64.3%, 88%, 93.3%), congenital anomalies (respectively, 28.6%, 6.1%, 1.4%), miscarriages (respectively, 13%, 2.7%, 4%), terminations (respectively, 32.5%, 9.3%, 2.7%), prematurities (11.9%, 16.7%, 2.9%) (respectively, p < 0.001, p<0.001, p=0.014, p<0.001). We detected novel phenotypical features in five patients. Conclusions: Our study demonstrated that study design, long-term follow-up, teratological counseling, and implementing advanced molecular analysis in complex phenotypes with novel phenotypical features are beneficial for understanding the association of congenital anomalies with isotretinoin exposure.
While evaluating congenital anomalies, we detected statistically significant differences between isotretinoin exposure vs control, but we did not detect any statistical differences between isotretinoin exposure before the pregnancy and controls. This conflict between our study and previous studies might be caused by no evident differentiation between isotretinoin exposure before the pregnancy and during the pregnancy and higher termination rates in previous studies.

14
Two viruses competition in the SIR model of epidemic spread: application to COVID-19

Trigger, S. A.; Ignatov, A. M.

2022-01-11 epidemiology 10.1101/2022.01.11.22269046
Top 0.1%
44× avg

The SIR model of epidemic spread is used to consider the problem of the competition of two viruses having different contagiousness. It is shown how the more contagious strain replaces the less contagious one over time. In particular, the results can be applied to the current situation, in which the omicron strain appeared in a population affected by the delta strain. PACS number(s): 02.50.-r, 05.60.-k, 82.39.-k, 87.19.Xx
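The qualitative claim above, that a more contagious strain displaces a less contagious one, can be illustrated with a minimal Python sketch of a two-strain SIR system. This is not the authors' formulation: the equations (one shared susceptible pool, full cross-immunity, equal recovery rates), the forward-Euler integration, and every parameter value are assumptions chosen only for illustration.

```python
def simulate_two_strains(beta1, beta2, gamma, i1_0, i2_0, days, dt=0.01):
    """Forward-Euler integration of a two-strain SIR model (illustrative only).

    Assumed equations, as population fractions with a shared susceptible pool:
        dS/dt  = -(beta1 * I1 + beta2 * I2) * S
        dI1/dt =  beta1 * S * I1 - gamma * I1
        dI2/dt =  beta2 * S * I2 - gamma * I2
    Returns the share of strain 2 among current infections at each step.
    """
    s = 1.0 - i1_0 - i2_0
    i1, i2 = i1_0, i2_0
    shares = [i2 / (i1 + i2)]
    for _ in range(round(days / dt)):
        new1 = beta1 * s * i1  # new infections per unit time, strain 1
        new2 = beta2 * s * i2  # new infections per unit time, strain 2
        s -= dt * (new1 + new2)
        i1 += dt * (new1 - gamma * i1)
        i2 += dt * (new2 - gamma * i2)
        shares.append(i2 / (i1 + i2))
    return shares


# Hypothetical parameters: strain 2 is twice as contagious but starts 100x rarer.
shares = simulate_two_strains(
    beta1=0.3, beta2=0.6, gamma=0.1, i1_0=1e-3, i2_0=1e-5, days=120
)
print(f"initial share of strain 2: {shares[0]:.4f}")
print(f"final share of strain 2:   {shares[-1]:.4f}")
```

Under these assumptions the ratio I2/I1 grows at rate (beta2 − beta1)·S, so the more contagious strain gains share for as long as susceptibles remain.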

15
A Covid-19 case mortality rate without time delay systematics

Lieu, R.; Quenby, S.; Jiang, A. B.-z.

2020-04-06 epidemiology 10.1101/2020.03.31.20049452
Top 0.1%
44× avg

Concerning the two approaches to the Covid-19 case mortality rate published in the literature, namely computing the ratio of (a) the daily number of deaths to a time delayed daily number of confirmed infections; and (b) the cumulative number of deaths to confirmed infections up to a certain time, both numbers having been acquired in the middle of an outbreak, it is shown that each suffers from systematic error of a different source. We further show that in the absence of detailed knowledge of the time delay distribution of (a), the true case mortality rate is obtained by pursuing method (b) at the end of the outbreak when the fate of every case has decisively been rendered. The approach is then employed to calculate the mean case mortality rate of 13 regions of China where every case has already been resolved. This leads to a mean rate of 0.527 ± 0.001%.

16
Modeling the COVID-19 epidemic in Okinawa

Pigolotti, S.; Chiuchiu, D.; Villa Martin, P.; Bhat, D.

2020-04-22 epidemiology 10.1101/2020.04.20.20071977
Top 0.1%
44× avg

We analyze current data on the COVID-19 spreading in Okinawa, Japan. We find that the initial spread is characterized by a doubling time of about 5 days. We implement a model to forecast the future spread under different scenarios. The model predicts that, if significant containment measures are not taken, a large fraction of the population will be infected with COVID-19, with the peak of the epidemic expected at the end of May and intensive care units having largely exceeded capacity. We analyzed scenarios implementing strong containment measures, similar to those imposed in Europe. The model predicts that an immediate implementation of strong containment measures (on the 19th of April) will significantly reduce the death count. We assess the negative consequences of these measures being implemented with a delay, or not being sufficiently stringent.

17
Geometric approach for non pharmaceutical interventions in epidemiology

Evain, L.; Loeb, J.-J.

2023-05-05 epidemiology 10.1101/2023.05.05.23289577
Top 0.1%
44× avg

Various non-pharmaceutical interventions have been put in place to minimise the burden of the COVID-19 outbreak. We build a framework to analyse the dynamics of non-pharmaceutical interventions, to distinguish between mitigation measures leading to objective scientific improvements and mitigations based on both political and scientific considerations. We analyse two possible strategies within this framework. Namely, we consider mitigations driven by the limited resources of the health system and mitigations where a constant set of measures is applied at different moments. We describe the optimal interventions for these scenarios. Our approach involves SIR differential systems; it is qualitative and geometrical rather than computational. Along with the analysis of these scenarios, we collect several results that may be useful on their own, in particular on the ground when the variables are not known in real time.

18
Monitoring and forecasting the COVID-19 epidemic in Moscow: model selection by balanced identification technology - version: September 2021

Sokolov, A.; Sokolova, L.

2021-10-11 epidemiology 10.1101/2021.10.07.21264713
Top 0.1%
43× avg

A mathematical model is a reflection of knowledge on the real object studied. The paper shows how the accumulation of data (statistical data and knowledge) about the COVID-19 pandemic leads to gradual refinement of mathematical models and to the expansion of the scope of their use. The resulting model satisfactorily describes the dynamics of COVID-19 in Moscow from 19.03.2020 to 01.09.2021 and can be used for forecasting with a horizon of several months. The dynamics of the model is mainly determined by herd immunity. Monitoring the situation in Moscow has not yet (as of 01.09.2021) revealed noticeable seasonality of the disease nor an increase in infectivity (due to the Delta strain). The results of using balanced identification technology to monitor the COVID-19 pandemic are: (i) models corresponding to the data available at different points in time (from March 2020 to August 2021); (ii) new knowledge (dependencies) acquired; (iii) forecasts for the third and fourth waves in Moscow. Discrepancies that manifested after 01.09.2021 and possible further modifications of the model are discussed.

19
An Ontology-Based Analysis of Health Disorders Derived from Stress-Evoked Peripheral Immune Responses

Burgos, J.; Sierra, C.

2025-09-15 allergy and immunology 10.1101/2025.09.14.25335729
Top 0.1%
43× avg

Chronic stress affects over 300 million individuals worldwide, contributing to a rising incidence of diseases associated with the peripheral immune response triggered by this condition, including depression, inflammatory bowel disease, metabolic syndrome, and coronary heart disease. To establish a structured understanding of these associations, an ontological approach based on Formal Concept Analysis, a mathematical framework for order relations, is employed to construct a conceptual hierarchy linking chronic stress to these diseases. Within this framework, the objects represent the set of stress-induced diseases, while the attributes correspond to specific combinations of chemokines and cytokines clinically associated with each condition. The findings of the ontological analysis suggest that stress-related diseases follow a staged progression: an initial induction phase, common to all diseases, characterized by the presence of chemokines and cytokines that induce a state of chronic inflammation (inflammaging); a subsequent progression phase, marked by immune response effector molecules that may be shared across different diseases; and a final consolidation phase, in which specific chemo- and cytokines distinctive to each disease are expressed.

20
A Tabular Residual Neural Network for Diabetes Classification and Prediction

Hammond, A.; Afridi, M.; Balakrishna, K.

2025-12-29 endocrinology 10.64898/2025.12.29.25343132
Top 0.1%
43× avg

Diabetes Mellitus (DM) is a metabolic disorder characterized by hyperglycemia, with type 1 characterized as an autoimmune destruction of pancreatic beta cells and type 2 characterized by insulin resistance with progressive beta cell dysfunction. This study applied an existing binary classification algorithm (ALTARN) to accurately predict DM. ALTARN, as a tabular attention residual neural network, uses residual connection to find complex patterns present in tabular columns. We achieved an average training accuracy of 75.22%. Furthermore, a robust set of validation metrics was obtained via five-fold stratified cross-validation, yielding an average accuracy of 74.61%, an average precision of 72.36%, a mean recall of 79.69%, and a mean F1 score of 75.83%.
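As a reading aid for the validation metrics quoted above, the following minimal Python sketch computes an F1 score as the harmonic mean of precision and recall. It uses only the averaged precision and recall from the abstract; the per-fold values are not given, and the function name is illustrative. The result, about 75.85%, is close to but not identical with the reported mean F1 of 75.83%, which is expected if the authors averaged per-fold F1 scores rather than computing F1 from averaged precision and recall (an assumption, not stated in the abstract).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Averages quoted in the abstract, expressed as fractions.
mean_precision = 0.7236
mean_recall = 0.7969

f1_from_means = f1_score(mean_precision, mean_recall)
print(f"F1 computed from averaged precision and recall: {f1_from_means:.4f}")
```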