This paper is concerned with the structural reliability of conjoint measurement when applied in a health care setting. The clinical context was the diagnosis and treatment of knee injuries. A conjoint measurement study was conducted using the pairwise choice approach to preference elicitation. Each choice comprised two scenarios: a conventional approach to management (arthroscopy) and an approach using magnetic resonance imaging. To test for structural reliability, two separate conjoint measurement exercises were conducted: exercise A, in which scenarios were defined in terms of three attributes, and exercise B, in which scenarios included all four attributes. The assessment of structural reliability involved a comparison of two random effects probit models, one for each exercise. Data were collected from a total of 176 students of sports science. The results strongly indicate that the models for the two exercises differ, although the instability is limited to the constant term and a single model attribute (the avoidance of surgery). The finding of instability in the constant coefficient raises important questions about the appropriateness of labelling scenarios in conjoint measurement exercises.