Fitting Markov chain models to discrete state series such as DNA sequences
Discrete state series such as DNA sequences can often be modelled by Markov chains. The analysis of such series is discussed in the context of log-linear models. Because the observations are dependent, the data produce contingency tables with similar margins. Despite this unusual structure, the analysis is equivalent to that for data from multinomial sampling. Theoretical arguments explain why the standard number of degrees of freedom is correct, and the asymptotic distribution of the deviance is verified empirically. Problems involved in fitting high-order Markov chain models, such as reduced power and computational expense, are also discussed.
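As an illustration of the kind of table and test statistic referred to above, the following is a minimal sketch rather than the paper's own procedure. It assumes the simplest comparison, a first-order Markov chain against independence (order zero), on a k-letter alphabet. The function names `transition_table` and `deviance_vs_independence` are hypothetical, and the degrees of freedom used, (k - 1)^2, are the standard value for independence in a k x k contingency table.

```python
import math


def transition_table(seq, alphabet="ACGT"):
    """Count adjacent pairs (x_t, x_{t+1}) in a k x k contingency table."""
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    k = len(alphabet)
    table = [[0] * k for _ in range(k)]
    for a, b in zip(seq, seq[1:]):
        table[index[a]][index[b]] += 1
    return table


def deviance_vs_independence(table):
    """Deviance (G^2) of the independence model against the first-order
    Markov chain, with the standard (k - 1)^2 degrees of freedom."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    g2 = 0.0
    for i in range(k):
        for j in range(k):
            obs = table[i][j]
            if obs > 0:  # empty cells contribute nothing to the deviance
                g2 += 2.0 * obs * math.log(obs * n / (row_tot[i] * col_tot[j]))
    return g2, (k - 1) ** 2


# Toy two-letter example: the pairs in "AACA" are AA, AC and CA.
table = transition_table("AACA", alphabet="AC")
g2, df = deviance_vs_independence(table)
```

In the toy example the row and column totals of the table coincide. In general they can differ only through the first and last symbols of the series, since every interior symbol is counted once as the start of a pair and once as the end of a pair. This is the "similar margins" structure described above.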