One proposal to fix the HAM-D was presented at the NCDEU by Amir Kalali, M.D., and colleagues of the International Society for CNS Drug Development (ISCDD) Depression Rating Scale Standardization Team (DRSST). Their GRID-HAMD, available online at
"Our goal for the DRSST process was not to change the items but to improve the current scale as much as possible by standardization, operationalization of concepts, better anchors and structured questions, as well as conventions," Kalali told Psychiatric Times.
Bagby and colleagues (2004) considered the GRID-HAMD, but found it neither sufficiently improved nor a suitable replacement, as it retains the original 17 items. "The ... Team failed to address many of the flaws of the original instrument," they asserted. "Most of the items still measure multiple constructs; items that have consistently been shown to be ineffective have been retained, and the scoring system still includes differential weighting of items."
Kalali explained to PT that it was not the intention of the DRSST to replace the HAM-D with an entirely different scale, but to offer an improvement until such a replacement could be developed. "We agree that the Hamilton Depression Rating Scale is flawed," he remarked. "However, while we await other scales to be developed and widely accepted, it is a practical reality that the Hamilton Depression Rating Scale will continue to be widely used in both regulatory and academic trials for at least a few years."
Kalali characterized the DRSST effort as a demonstration project for the ISCDD collaboration between pharmaceutical manufacturers and academic centers. He anticipates that a new ISCDD-funded initiative, the Depression Inventory Development Team, which will include Bagby, will produce a widely accepted depression scale that incorporates appropriate, data-driven items, shows consistent item response and is sensitive to contemporary depression management interventions.
Bagby and colleagues (2004) derived their unfavorable assessment of the HAM-D-17 from 70 studies that examined one or more of the psychometric properties of reliability, item response and validity. Reliability was evaluated in three forms: internal reliability among the instrument's items, retest reliability and reliability across different raters. Item response analysis ascertained sensitivity to different levels of, and changes in, symptom severity.
Studies examined content, convergent, discriminant, factorial and predictive validity. Content validity reflects how well the scale items correspond to known factors of depression. Convergent validity is the correlation with other measures of depression. Discriminant validity is the ability to distinguish between groups with and without depression. Factorial validity is derived from factor analysis of the empirical structure of the scale, ascertaining whether each item loads on the factor for which it was designed. Predictive validity is the ability to predict change in symptom severity with treatment.
Bagby and colleagues (2004) found agreement among studies that the majority of HAM-D items have adequate internal reliability, although loss of insight had the most variable rankings. Retest reliability at the item level was negligible for some items, but this varied considerably across studies and was enhanced with the use of structured interview guides.