Repeated locomotion scoring of a sow herd to measure lameness: consistency over time, the effect of sow characteristics and inter-observer reliability
Investigating variability of scores between different observers, between animals and over time aids the design of valid sampling methodologies for measuring animal welfare. Locomotion scores (0 to 5 scale) were collected from 154 sows in one herd on up to ten occasions over a 19-month period, with three to five observers
scoring each time; 123 of these sows were additionally scored prior to farrowing and at weaning. The distribution of scores was highly skewed towards low scores (0: 84.8%, 1: 9.5%, 2: 4.0%, 3+: 1.7%). Sows showed moderate
consistency in their scores over time and later parity sows had higher scores, but there was no effect of stage in the reproductive cycle (days pregnant, pre-farrowing, post-weaning). This suggests that infrequent visits to a farm (eg annual) might provide an accurate estimate of the extent
of lameness if a representative range of parities was sampled, although a larger study of more farms would be required to investigate this. The three different types of agreement between observers (absolute differences, matching and association) were assessed as follows: i) analysis of absolute
differences between observers showed that the farm manager scored lower than researchers/technicians; ii) exact matching approaches suggested fair or good agreement, with agreement poorest for mild gait abnormalities (score 1 'stiff') and improving when scores were combined into
'sound' (0–1) and 'lame' (2–5) categories; and iii) measures of association suggested moderate agreement. Inter-observer reliability improved over time until the fifth scoring event. To improve inter-observer agreement, observer training/practice and the use of fewer categories are
recommended, and inter-observer agreement should be checked regularly.
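
The difference between the absolute-difference and exact-matching views of agreement, and the effect of combining scores into 'sound' and 'lame', can be illustrated with a minimal sketch. The abstract does not name the statistics used, so Cohen's kappa is assumed here as one common exact-matching measure. The two score lists and all function names are hypothetical, invented purely for illustration, and are not data from the study. Measures of association are omitted for brevity.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of animals given exactly the same score by both observers."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: exact agreement corrected for agreement expected by chance."""
    n = len(a)
    p_obs = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement from each observer's marginal category frequencies.
    p_exp = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)


def mean_difference(a, b):
    """Mean signed difference (a - b); a non-zero value suggests systematic bias."""
    return sum(x - y for x, y in zip(a, b)) / len(a)


def collapse(scores):
    """Collapse the 0 to 5 scale into 'sound' (0-1) and 'lame' (2-5)."""
    return ["lame" if s >= 2 else "sound" for s in scores]


# Hypothetical scores for 20 sows from two observers (invented, not study data).
researcher = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 4]
manager = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 1, 2, 3]

print("mean difference:", mean_difference(researcher, manager))
print("kappa, 0-5 scale:", round(cohen_kappa(researcher, manager), 3))
print("kappa, sound/lame:",
      round(cohen_kappa(collapse(researcher), collapse(manager)), 3))
```

In this invented example the second observer scores slightly lower on average, and kappa is higher on the two-category scale than on the full scale, because disagreements between adjacent scores such as 0 and 1 no longer count as mismatches. Real data may behave differently, particularly with a distribution as skewed towards score 0 as the one reported here.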