Mathematics
MDPI AG
All preprints, ranked by how well they match Mathematics's content profile, based on 11 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Lee, T.- W.; Park, J. E.; Hung, D.
Covid-19 is characterized by rapid transmission and severe symptoms, leading to deaths in some cases (ranging from 1.5 to 12% of the affected, depending on the country). We identify the Gaussian nature of mortality due to Covid-19, as shown in China, where it appears to have run its course (during the first sweep of the pandemic at least), in other countries, and in Imperial College modeling. A Gaussian function involves three parameters: the height, the peak location, and the width. The streaming data can be used to infer the function value, slope, and inflection location as a minimum set of constraints for estimating the subsequent trajectories. Thus, we apply the Gaussian function template as the basis for a data-assimilated model of Covid-19 trajectories, first for the USA, the United Kingdom (UK), Iran, and the world total in this study. As more data become available, the Gaussian trajectories are updated for other nations and also for state-by-state projections in the USA.
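The three constraints named in this abstract (function value, slope, and inflection location) are enough to pin down the three Gaussian parameters. The following is a minimal sketch, not the authors' procedure: it assumes the value and slope are both measured at the rising inflection point, and the function names are hypothetical.

```python
import math


def gaussian(t, height, peak, width):
    """Gaussian template: height * exp(-(t - peak)^2 / (2 * width^2))."""
    return height * math.exp(-((t - peak) ** 2) / (2.0 * width ** 2))


def gaussian_from_inflection(t_inflect, value, slope):
    """Recover (height, peak, width) from data at the rising inflection point.

    Assumption (not from the abstract): `value` and `slope` are the daily
    count and its time derivative measured at t_inflect = peak - width.
    At that point g' = g / width and g = height * exp(-1/2).
    """
    if value <= 0 or slope <= 0:
        raise ValueError("value and slope must be positive on the rising side")
    width = value / slope
    peak = t_inflect + width
    height = value * math.exp(0.5)
    return height, peak, width
```

Once the parameters are estimated, `gaussian(t, height, peak, width)` projects the remainder of the trajectory; in practice the slope and inflection location would have to be estimated from noisy daily counts, which this sketch does not address.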
Furuyama, T. N.; de Carvalho Mello, I. M. V. G.; Janini, L. M. R.; Antoneli, F. M.
The problem of empirical estimation of mutation rates is fundamental for the understanding of viral evolution. The estimation of viral mutation rates is based on varied and often complex methods carried out through experiments essentially designed to count mutation frequencies. Mutation rates are defined as the probabilities of nucleotide substitutions, typically reported as a single number in units of mutation (substitution) per base (nucleotide) per replication cycle or per cell infection, depending on the replication mode of the virus. Moreover, the uncertainty quantification of these estimates is so difficult that it is rarely reported in the literature. The values reported for the same virus fall within a broad range, sometimes spanning two orders of magnitude. For instance, the mutation rates range from 10^-8 to 10^-6 mutation per base per cell infection for DNA viruses and from 10^-6 to 10^-4 mutation per base per cell infection for RNA viruses. In this paper, we propose an alternative perspective on the estimation of mutation rates, which avoids the use of consensus sequences and/or serial passages. Our approach leverages the large amount of sequencing data produced by high-throughput sequencing technologies, coupled to an experimental design that performs a single replication cycle from an initial clonal viral population. We propose to replace the single numeric mutation rate with a distribution of mutation rates (DMR), together with a procedure for estimating this distribution from sequencing data. Even though the focus of this paper is the development of the approach centered on the DMR, it is straightforward to produce point and interval estimates of the mutation rates, including uncertainty quantification. In addition to the estimation of the DMR, we provide a theoretical characterization of it as being well approximated by a log-normal distribution. Finally, we study some non-trivial properties of the DMR related to a remarkable invariance under down-scaling the distribution from the genome to its subunits.
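If a distribution of mutation rates is well approximated by a log-normal, point and interval estimates follow from the mean and standard deviation of the log-rates. The sketch below illustrates that generic calculation only; it assumes a hypothetical list of per-site rate estimates is already available and is not the estimation procedure of the paper.

```python
import math
import statistics


def lognormal_summary(rates, z=1.96):
    """Summarise positive rates under a log-normal approximation.

    Returns (median, lower, upper), where the median is exp(mean of logs)
    and the interval is exp(mean +/- z * sample standard deviation of logs).
    With z = 1.96 the interval covers roughly 95% of a log-normal population.
    """
    if len(rates) < 2 or any(r <= 0 for r in rates):
        raise ValueError("need at least two strictly positive rates")
    logs = [math.log(r) for r in rates]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    return math.exp(mu), math.exp(mu - z * sigma), math.exp(mu + z * sigma)
```

For example, `lognormal_summary([1e-6, 1e-5, 1e-4])` gives a median of about 1e-5, with the interval spread symmetrically on the log scale.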
Dolgikh, S.
Analysis of small datasets presents a number of essential challenges, not least because insufficient sampling of characteristic patterns in the data makes confident conclusions about the unknown distribution elusive and results in lower statistical confidence and higher error. In this work, a novel approach to augmentation of small datasets is proposed, based on an ensemble of neural network models of unsupervised generative self-learning. Applying generative learning with an ensemble of individual models made it possible to identify stable clusters of data points in the latent representations of the observable data. Several augmentation techniques based on the identified latent cluster structure were applied to produce new data points and enhance the dataset. The proposed method can be used with small and extremely small datasets to identify characteristic patterns, augment data, and in some cases improve classification accuracy in scenarios with a strong deficit of labels.
Miyake, J.; Sato, T.; Baba, S.; Nakamura, H.; Niioka, H.; Nakazawa, Y.
We report on a method for analyzing variants of coronavirus genes using an autoencoder. Since coronaviruses have mutated rapidly and generated a large number of genotypes, an appropriate method for understanding the entire population is required. The autoencoder-based method meets this requirement and is suitable for understanding how and when the variants emerge and disappear. For the over 30,000 SARS-CoV-2 ORF1ab gene sequences sampled globally from December 2019 to February 2021, we were able to represent a summary of their characteristics in a 3D plot and show the expansion, decline, and transformation of the virus types over time and by region. Based on ORF1ab genes, the SARS-CoV-2 viruses were classified into five major types (A, B, C, D, and E in the order of appearance): the virus type that originated in China at the end of 2019 (type A) practically disappeared in June 2020; two virus types (types B and C) have emerged in the United States and Europe since February 2020, and type B has become a global phenomenon. Type C is only prevalent in the U.S. and is suspected to be associated with high mortality, but this type also disappeared at the end of June. Type D is only found in Australia. Currently, the epidemic is dominated by types B and E.
AL-Mekhlafi, S. M.; Bonyah, E.
This paper introduces an optimal control strategy for a crossover mathematical model of cholera. The proposed model integrates Ψ-Caputo fractal variable-order derivatives, fractal fractional-order derivatives, and integer-order derivatives across three distinct time intervals, utilizing a simple non-standard kernel function Ψ(t). A comprehensive stability analysis of the model's steady states is conducted. The model's results are compared with real-world data from the cholera outbreak in Yemen. Following this, an optimal control problem is formulated within the crossover framework. To numerically solve the resulting optimality system, a discretized non-standard finite difference method is developed. Numerical simulations and comparative studies are presented to demonstrate the method's applicability and the efficiency of the approximation approach. The key finding of this study is that the crossover-controlled system proves to be the most effective approach for mitigating and controlling the spread of cholera.
Hazem, Y.; Natarajan, S.; Berikaa, E.
The outbreak of COVID-19 has had an undeniable global impact, both socially and economically. On March 11, 2020, COVID-19 was declared a pandemic. Many governments worldwide have imposed strict lockdown measures to minimize the spread of COVID-19. However, these measures cannot last forever; therefore, many countries are already considering relaxing them. This study quantitatively investigated the impact of this relaxation in the United States, Germany, the United Kingdom, Italy, Spain, and Canada. A modified version of the SIR model is used to model the reduction in lockdown based on the already available data. The results showed an inevitable second wave of COVID-19 infection following the loosening of the current measures. The predicted number of infected cases is reported for different reopening dates.
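The abstract does not specify how the SIR model was modified. As a hedged illustration of the general idea, the sketch below integrates the classical SIR equations with a transmission rate that switches from a lockdown value to a higher value on a chosen reopening day; the function name, parameters, and the step-change form are all assumptions made for this example.

```python
def simulate_sir(beta_lockdown, beta_open, gamma, reopen_day, days,
                 population, infected0, dt=0.1):
    """Forward-Euler integration of SIR with a step change in beta.

    Assumption: lockdown relaxation is modelled as a jump in the
    transmission rate from beta_lockdown to beta_open at reopen_day.
    Returns a list of (time, susceptible, infected, recovered) tuples.
    """
    s = float(population - infected0)
    i = float(infected0)
    r = 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(steps):
        t = k * dt
        beta = beta_lockdown if t < reopen_day else beta_open
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append(((k + 1) * dt, s, i, r))
    return history
```

Running the function for several values of `reopen_day` and comparing the infected curves gives the kind of reopening-date comparison the abstract describes, although realistic use would require fitting `beta_lockdown`, `beta_open`, and `gamma` to reported data.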
Anh, V. V.; Nguyen, H. T.; Craig, A.; Tran, E.; Wang, Y.
This paper investigates the cause and detection of power-law scaling of brain wave activity due to the heterogeneity of the brain cortex, considered as a complex system, and the initial condition such as the alert or fatigue state of the brain. Our starting point is the construction of a mathematical model of global brain wave activity based on EEG measurements on the cortical surface. The model takes the form of a stochastic delay-differential equation (SDDE). Its fractional diffusion operator and delay operator capture the responses due to the heterogeneous medium and the initial condition. The analytical solution of the model is obtained in the form of a Karhunen-Loeve expansion. A method to estimate the key parameters of the model and the corresponding numerical schemes are given. Real EEG data on driver fatigue at 32 channels measured on 50 participants are used to estimate these parameters. Interpretation of the results is given by comparing and contrasting the alert and fatigue states of the brain. The EEG time series at each electrode on the scalp display power-law scaling, as indicated by their spectral slopes in the low-frequency range. The diffusion of the EEG random field is non-Gaussian, reflecting the heterogeneity of the brain cortex. This non-Gaussianity is more pronounced for the alert state than the fatigue state. The response of the system to the initial condition is also more significant for the alert state than the fatigue state. These results demonstrate the usefulness of global SDDE modelling complementing the time series approach for EEG analysis.
Gonnet, G.; Stewart, J.; Lafleur, J.; Keith, S.; McLellan, M.; Jiang-Gorsline, D.; Snider, T.
We have developed a new technique of Feature Importance, a topic of machine learning, to analyze the possible causes of the Covid-19 pandemic based on country data. This new approach works well even when there are many more features than countries and is not affected by high correlation of features. It is inspired by the Gram-Schmidt orthogonalization procedure from linear algebra. We study the number of deaths, which is more reliable than the number of cases at the onset of the pandemic, during Apr/May 2020. This is while countries started taking measures, so more light will be shed on the root causes of the pandemic rather than on its handling. The analysis is done against a comprehensive list of roughly 3,200 features. We find that globalization is the main contributing cause, followed by calcium intake, economic factors, environmental factors, preventative measures, and others. This analysis was done for 20 different dates and shows that some factors, like calcium, phase in or out over time. We also compute row explainability, i.e. for every country, how much each feature explains the death rate. Finally we also study a series of conditions, e.g. comorbidities, immunization, etc. which have been proposed to explain the pandemic and place them in their proper context. While there are many caveats to this analysis, we believe it sheds light on the possible causes of the Covid-19 pandemic. One-Sentence Summary: We use a novel feature importance technique to find that globalization, followed by calcium intake, economic factors, environmental factors, and some aspects of societal quality are the main country-level data that explain early Covid-19 death rates.
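The abstract says only that the technique is inspired by Gram-Schmidt orthogonalization. One generic way such an idea can be realised is greedy forward selection with orthogonalization, sketched below with NumPy. This is an assumption-laden illustration, not the authors' algorithm: the function name and the variance-explained score are hypothetical choices.

```python
import numpy as np


def gram_schmidt_importance(X, y, k):
    """Greedy feature ranking with Gram-Schmidt style orthogonalization.

    At each step, pick the feature that explains the most variance of the
    current residual, then project that direction out of the residual and
    out of every remaining feature, so correlated features are not
    credited twice. Returns a list of (feature_index, fraction_explained).
    """
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)          # centred copy of the features
    r = np.asarray(y, dtype=float)
    r = r - r.mean()                   # centred target (the residual)
    total = float(r @ r)
    if total == 0.0:
        return []
    remaining = list(range(Xc.shape[1]))
    ranking = []
    for _ in range(min(k, len(remaining))):
        best_j, best_gain = None, -1.0
        for j in remaining:
            norm2 = float(Xc[:, j] @ Xc[:, j])
            if norm2 < 1e-12:          # feature already fully explained
                continue
            gain = float(Xc[:, j] @ r) ** 2 / norm2
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            break
        q = Xc[:, best_j] / np.linalg.norm(Xc[:, best_j])
        ranking.append((best_j, best_gain / total))
        remaining.remove(best_j)
        r = r - (q @ r) * q
        for j in remaining:
            Xc[:, j] = Xc[:, j] - (q @ Xc[:, j]) * q
    return ranking
```

Because each selected direction is removed before the next feature is scored, the procedure remains well defined when there are more features than rows, which matches the setting described in the abstract.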
Frausto-Solis, J.; Olvera Vazquez, J. E.; Gonzalez-Barbosa, J. J.; Castilla-Valdez, G.; Sanchez-Hernandez, J. P.; Perez-Ortega, J.
SARS-CoV-2 produces the new COVID-19 disease, which is one of the most dangerous pandemics of modern times. This pandemic has critical health and economic consequences, and even the health services of large, powerful nations may be saturated. Thus, forecasting the number of infected persons in any country is essential for controlling the situation. Different forecasting methods have been published in the literature in an attempt to solve the problem. However, a simple and accurate forecasting method is required for implementation in any part of the world. This paper presents a precise and straightforward forecasting method named SVR-ESAR (Support Vector Regression hybridized with classical Exponential Smoothing and ARIMA). We applied this method to the time series of infections in four scenarios, which we took from the GitHub repository: the Whole World, China, the US, and Mexico. We compared our results with those in the literature, showing that the proposed method has the best accuracy.
Balasubramanian, K.; Nagaraj, N.
Finding a vaccine or specific antiviral treatment for global pandemics of viral diseases (such as the ongoing COVID-19) requires rapid analysis, annotation and evaluation of metagenomic libraries to enable a quick and efficient screening of nucleotide sequences. Traditional sequence alignment methods are not suitable and there is a need for fast alignment-free techniques for sequence analysis. Information theory and data compression algorithms provide a rich set of mathematical and computational tools to capture essential patterns in biological sequences. In 2013, our research group (Nagaraj et al., Eur. Phys. J. Special Topics 222(3-4), 2013) proposed a novel measure known as Effort-To-Compress (ETC), based on the notion of compression-complexity, to capture the information content of sequences. In this study, we propose a compression-complexity based distance measure for automatic identification of SARS coronavirus strains from a set of viruses using only short fragments of nucleotide sequences. We also demonstrate that our proposed method can correctly distinguish SARS-CoV-2 from SARS-CoV-1 viruses by analyzing very short segments of nucleotide sequences. This work could be extended further to enable medical practitioners to automatically identify and characterize SARS coronavirus strains in a fast and efficient fashion using short and/or incomplete segments of nucleotide sequences. Potentially, the need for sequence assembly can be circumvented. Note: The main ideas and results of this research were first presented at the International Conference on Nonlinear Systems and Dynamics (CNSD-2013) held at Indian Institute of Technology, Indore, December 12, 2013. In this manuscript, we have extended our preliminary analysis to include SARS-CoV-2 virus as well.
Kumar, A.; Mohammad Khan, F.; Gupta, R.; Puppala, H.
The outbreak of COVID-19 was first identified in China; it later spread to various parts of the globe and was declared a pandemic by the World Health Organization (WHO). The person-to-person transmissible pneumonia caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), known as COVID-19, has sparked a global warning. Thermal screening, quarantining, and later lockdown were methods employed by various nations to contain the spread of the virus. Though exercising various plans to contain the spread helps in mitigating the effect of COVID-19, projecting the rise and preparing to face the crisis would help in minimizing the effect. In this scenario, this study attempts to use Machine Learning tools to forecast the possible rise in the number of cases by considering the data on daily new cases. To capture the uncertainty, three different techniques, (i) the Decision Tree algorithm, (ii) the Support Vector Machine algorithm, and (iii) Gaussian process regression, are used to project the data and capture the possible deviation. Based on the projection of new cases, the recovered cases, deceased cases, medical facilities, population density, number of tests conducted, and facilities of services are considered to define the criticality index (CI). The CI is used to classify all the districts of the country into regions of high risk, moderate risk, and low risk. An online dashboard is created, which updates the data on a daily basis for the next four weeks. The prospective suggestions of this study would aid in planning strategies to apply the lockdown or any other plan for any country, which can take other parameters to define the CI.
Chattopadhyay, R.; Surendran, D.; S, L.; Guhathakurta, P.; Hosaliker, K. S.; Pai, D.; M, M.; Mohapatra, M.
Modelling the dynamics of mosquito-borne disease (MBD) cases is a challenging task. The current study first proposes a generic dynamical model to qualitatively understand the seasonality as well as outbreaks of malaria and dengue over the state of Kerala based on a climate-forced oscillator model, which is then supplemented by a data-driven model for quantitative evaluation. The proposed forced oscillator model is parametric and general in nature and can be used qualitatively to understand the seasonality and outbreaks. However, since parametric model-based estimation requires estimation of multiple parameters and several closure assumptions, we used K-means clustering, a data-driven clustering approach, to understand the relationship between malaria and dengue cases and climate forcing. The results showed a clear relationship of the MBD cases with the first-order and second-order moments (i.e. mean and standard deviation) of the climate forcing parameters. Based on this, we came up with an objective threshold criterion which relates the climate parameters to the number of malaria and dengue cases over Kerala.
Ananthakrishna, G.; Kumar, J.
We introduce a deterministic model that partitions the total population into the susceptible, infected, quarantined, and those traced after exposure, the recovered and the deceased. We hypothesize accessible population for transmission of the disease to be a small fraction of the total population, for instance when interventions are in force. This hypothesis, together with the structure of the set of coupled nonlinear ordinary differential equations for the populations, allows us to decouple the equations into just two equations. This further reduces to a logistic type of equation for the total infected population. The equation can be solved analytically and therefore allows for a clear interpretation of the growth and inhibiting factors in terms of the parameters in the full model. The validity of the accessible population hypothesis and the efficacy of the reduced logistic model is demonstrated by the ease of fitting the United Kingdom data for the cumulative infected and daily new infected cases. The model can also be used to forecast further progression of the disease. In an effort to find optimized parameter values compatible with the United Kingdom coronavirus data, we first determine the relative importance of the various transition rates participating in the original model. Using this we show that the original model equations provide a very good fit with the United Kingdom data for the cumulative number of infections and the daily new cases. The fact that the model calculated daily new cases exhibits a turning point, suggests the beginning of a slow-down in the spread of infections. However, since the rate of slowing down beyond the turning point is small, the cumulative number of infections is likely to saturate to about 3.52 × 10^5 around late July, provided the lock-down conditions continue to prevail. Noting that the fit obtained from the reduced logistic equation is comparable to that with the full model equations, the underlying causes for the limited forecasting ability of the reduced logistic equation are elucidated. The model and the procedure adopted here are expected to be useful in fitting the data for other countries and in forecasting the progression of the disease.
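For orientation, the standard logistic equation and its closed-form solution are shown below. The symbols are assumptions introduced here, not the authors' notation: I(t) is the cumulative infected population, r the growth rate, K the saturation level, and I_0 the initial value. The paper's reduced equation may differ in how r and K depend on the parameters of the full model.

```latex
\frac{dI}{dt} = r\,I\left(1 - \frac{I}{K}\right),
\qquad
I(t) = \frac{K}{1 + \dfrac{K - I_0}{I_0}\,e^{-rt}},
\qquad
\frac{d^2 I}{dt^2} = 0 \iff I = \frac{K}{2}.
```

The last relation says that daily new cases dI/dt peak when the cumulative count reaches half the saturation level, which is the kind of turning point the abstract refers to.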
Musa, R.; Ezugwu, A. E.; Mbah, G. C.
The novel coronavirus has spread across more than 213 countries within the space of six months, causing a devastating public health hazard and monumental economic loss. In the absence of clinically approved pharmaceutical interventions, attention has shifted to non-pharmaceutical controls to mitigate the burden of the pandemic. In this regard, a mathematical model with ten mutually exclusive compartments is developed to investigate the possible effects of both pharmaceutical and non-pharmaceutical controls, incorporating both private and government quarantine and treatment. Several reproduction numbers were calculated and used to determine the impact of both control measures as well as the projected benefits of social distancing, treatment and vaccination. We investigate and compare the possible impact of social distancing incorporating different levels of vaccination with that of a vaccination programme incorporating different levels of treatment. Using the officially published South African COVID-19 data, the numerical simulation shows that the community reproduction threshold will be 30 when there is no social distancing but will be drastically reduced to 5 (about an 83% reduction) when social distancing is enforced. Furthermore, when there is vaccination with perfect efficacy, the community reproduction threshold will be 4, compared with 12 without vaccination (about a 67% reduction). We also established that the implementation of both interventions is enough to curtail the spread of the COVID-19 pandemic in South Africa, which confirms the recommendation of the World Health Organization.
Ruzhansky, M.; Tokmagambetov, N.; Torebek, B.
We consider a simple model for the COVID-19 pandemic to analyse the relative effectiveness of several stages of the lockdown in Belgium, as well as of several phases of its relaxation. We also make a future projection of different types of measures relative to different stages of the already experienced lockdown.
Garcia Islas, E. I.; De Anda Jauregui, G.; Salas Rodriguez, J.; Serrania Soto, F.
In terms of the number of fatalities, Mexico has been one of the countries most affected worldwide by the pandemic. Using different Machine Learning techniques, some of the first cases of the infection registered in Mexico City (CDMX), the geographical and political center of the country, are analyzed in order to determine the causes of lethality and evolution of infection by the SARS-CoV-2 virus, from April 1 to September 27, 2020 in workers of the Capital Metro.
Lee, S. J.; Durant, T. J.; Dudgeon, S.; Nelson, B.; Young, P.; Horn, G.; Schulz, W. L.
Model tuning with the optimization of pipeline configuration is a well-established practice for the development of machine learning models. However, this often entails an exhaustive search process, especially as the parameter space expands with increasing model complexity. In the emerging field of quantum machine learning (QML), there is limited literature on the effects of configuration parameters, especially quantum-specific ones, and their choices on model performance. To address this gap, here we present a study exploring the impacts of data scaling and configuration parameters in quantum neural network (QNN) development using beta regression. Our experiments with two benchmark datasets showed that a well-tuned QNN can achieve predictive performance comparable to its classical counterparts. Our findings also demonstrate useful reference points of QNN model tuning to support a more efficient parameter optimization process.
Nikitenkova, S.; Kovriguine, D. A.
We have detected a regular component of the monitoring error of officially registered total cases of the spread of the current pandemic. This regular error component explains the reason for the failure of a priori mathematical modelling of probable epidemic events in different countries of the world. Processing statistical data of countries that have reached an epidemic peak has shown that this regular monitoring obeys a simple analytical regularity which allows us to answer the question: is this or that country that has already passed the threshold of the epidemic close to its peak or is still far from it?
Punn, N. S.; Sonbhadra, S. K.; Agarwal, S.
The catastrophic outbreak of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), also known as COVID-2019, has brought a worldwide threat to society. The whole world is making enormous efforts to fight the spread of this deadly disease in terms of infrastructure, finance, data sources, protective gear, life-risk treatments and several other resources. Artificial intelligence researchers are focusing their expertise on developing mathematical models for analyzing this epidemic situation using nationwide shared data. To contribute to the well-being of society, this article proposes to use machine learning and deep learning models to understand the everyday exponential behaviour of the disease and to predict the future reach of COVID-2019 across nations, using real-time information from the Johns Hopkins dashboard.
Scholz, E.; Kreck, M.
In an earlier paper we proposed a recursive model for epidemics; in the present paper we generalize this model to include the asymptomatic or unrecorded symptomatic people, which we call dark people (dark sector). We call this the SEPARd-model. A delay differential equation version of the model is added; it allows a better comparison to other models. We carry this out by a comparison with the classical SIR model and indicate why we believe that the SEPARd model may work better for Covid-19 than other approaches. In the second part of the paper we explain how to deal with the data provided by the JHU, in particular we explain how to derive central model parameters from the data. Other parameters, like the size of the dark sector, are less accessible and have to be estimated more roughly, at best by results of representative serological studies which are accessible, however, only for a few countries. We start our country studies with Switzerland where such data are available. Then we apply the model to a collection of other countries, three European ones (Germany, France, Sweden), the three most stricken countries from three other continents (USA, Brazil, India). Finally we show that even the aggregated world data can be well represented by our approach. At the end of the paper we discuss the use of the model. Perhaps the most striking application is that it allows a quantitative analysis of the influence of the time until people are sent to quarantine or hospital. This suggests that imposing means to shorten this time is a powerful tool to flatten the curves.