Using partial components to restore and use the concurrent validity of the Index of Readiness

In the presence of correlations among the dimensions of psychometric tests with summated scales, it is sometimes difficult to use the scores on the dimensions to predict their effects on various responses of interest through ordinary or generalized regression models, which can serve as concurrent validations. We will use the Index of Readiness (IR) as a case study to describe a statistical procedure to address this problem. Our solution will allow us to propose an optimal strategy of care to increase the adherence of HIV patients to treatments, as measured by a health indicator, by improving their readiness.

Even on established and well validated multidimensional scales the psychometric properties on samples other than the original or princeps sample are sometimes difficult to ascertain (DeVellis, 2011;McIntire &al., 2010;Spector, 1992).
Moreover, in the presence of correlated dimensions, when one tries to establish concurrent or criterion-based validity against an external measurement, the small coefficients of determination of the regression models prevent the identification and validation of any expected regression model for the criterion.
Even more troublesome, the usual indicators of reliability or internal validity of the scales are no guarantee of the unidimensionality of any given presumed dimension.
These problems are exacerbated on small samples, which are numerous in the health sciences. In that case it is more difficult, or even impossible, to use validation techniques such as confirmatory factor analysis (CFA) or structural equation models (SEM).
This is exactly what is observed here on the Index of Readiness scale (IR), where one would like to predict the adherence of HIV patients to their treatments, as measured by their CD4 counts, and if possible use their readiness characteristics to increase this indicator of adherence.
In the following, we describe the IR scale and our response variable, the CD4 counts of the patients (section 2), exhibit the problem of concurrent or predictive validity for the CD4 counts (section 3), and describe a statistical methodology that gives a better description of the readiness of a patient according to the IR scale and restores this scale's predictive validity (sections 4 & 4.1).
Using the results obtained in the first parts of section 4, we end that section with a proposal for an intervention designed to increase the CD4 counts of HIV patients, and we make predictions for the effects on CD4 counts of any intervention based on the proposed lines (section 4.2).
Finally, we close this report with a discussion (section 5), with an emphasis on the various validations performed throughout the research, and a brief conclusion (section 6) pointing to further research. 1

Description of the Index of Readiness (IR), and our response variable
Successful treatment of HIV patients depends on their ability to maintain nearly perfect adherence to combinations of antiretroviral medications (Moyle, 2000). Countering nonadherence is considered one of the greatest challenges in the medical management of HIV disease: substantial numbers of HIV-positive individuals are currently failing treatment due to nonadherence (Fogarty &al., 2002; Sankur &al., 2002). It has been shown that taking less than 95% of prescribed antiretroviral medication doses leads to drug resistance and subsequent treatment failure (Low-Beer &al., 2000). In this context, the general consensus among HIV care providers is that treatment of HIV infection should be delayed until an individual is "ready" to adhere to antiretroviral medications (Cheever, 2000; Noring &al., 2001; Sorenson &al., 1998).
Fleury's wellness motivation theory (Derenovsky-Fleury, 1991) provides a framework for understanding the process of initiating and maintaining the action of health behavior change. Readiness, defined as a conscious awareness that a particular behavior is desirable and beneficial (Fowler, 1998), is the first stage in the process and occurs prior to the change. It was hypothesized, based on Fleury's theory, that individuals with higher readiness levels at baseline would be more likely to adhere to their anti-HIV medication regimens. In addition, based on her theory, Fleury (1994) provided a measure of the concept, the Index of Readiness scale (IR), one of the rare instruments measuring readiness to initiate health behavior change. Our aim in the study was, after the establishment of a valid translation in French of the original English version of the IR, to examine the effects of readiness and its various dimensions on a commonly admitted response measurement for adherence to treatment, the rate of CD4 cells in the blood (Vedhara, 1997), that should serve as a concurrent validation of the IR scale. 2
The three dimensions of readiness developed in the 30 items of the IR scale are: (1) Revaluation of Lifestyle ('Reval') with 11 averaged items; (2) Identification of barriers/Creating strategies ('Ident'), with 12 averaged items; and finally (3) Goal commitment ('Commit') with 7 averaged items. We can calculate a total readiness index (TotalRead) by averaging the 30 items. All the items are measured on a Likert scale of 5 intensities, from 1: 'Strongly disagree' to 5: 'Strongly agree'.
1 A more complete text including the questioning that gave rise to the development described here, which can serve as a course document or tutorial, as well as the data file can be found at the following address: http://wikistat.polymtl.ca/tikiindex.php?page=Psychom%C3%A9trie+et+statistique.
The scale's items were translated into French and the translation was validated according to the usual Brislin procedure (Brislin, 1970;Varricchio, 1997): the translation was followed by a retro-translation back into English by a different and independent translator, with the discrepancies resolved for the French translation by a professional linguist to adjust for the level of language.
The French version was then administered to N=105 HIV patients twice, at two times separated by a six-month period. 3 As far as can be observed from the figures of Tables 1a & 2, the French translation of the test has psychometric properties very similar to those reported in the literature for the original English version. We report test-retest figures (Table 1b), a first for the IR. Table 1 provides the usual internal consistency indicators for the dimensions of the scale: the two times of measurement allow for test-retest correlations (or reliabilities), which are quite high. The Cronbach's alphas for Time 1 are well within the usual acceptable values in view of the commonly accepted, but empirical, criterion (Loewenthal, 1996; Carmines & Zeller, 1979). The figures are very similar to the ones reported for the English version of the scale.
The measure of adherence to HIV treatment (a combination of antiretroviral medications) considered in this [...]
According to Enriquez &al. (2004), who used the IR scale on a sample of N=19 patients who had previously failed treatment due to non-adherence, fairly strong correlations were observed between the 3 dimensions: they reported sample correlations between 0,61 and 0,70. We observed, unsurprisingly, inter-dimensional sample correlations between 0,74 and 0,77 among our N=104 patients (see Table 2).
One also notices (Table 2) a total lack of correlation between the CD4 responses and the three dimensions or the total readiness. Some correlations with the CD4 counts are even negative, contrary to common sense (the CD4 counts should increase with the IR dimension scores), although none is significantly different from zero. Very strong correlations are naturally observed between the three dimensions and the total readiness index, which is a weighted average of the three.
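The remark that the total readiness index is a weighted average of the three dimensions can be checked with a short Python sketch. The weights 11/30, 12/30 and 7/30 follow from the item counts given above. The column layout (which item belongs to which dimension) is a hypothetical assumption made for the illustration, and the responses are simulated.

```python
import numpy as np

# Hypothetical column layout (assumption): items 0-10 = Reval,
# 11-22 = Ident, 23-29 = Commit. The real IR item order may differ.
DIMENSIONS = {
    "Reval": slice(0, 11),
    "Ident": slice(11, 23),
    "Commit": slice(23, 30),
}

def score_ir(items):
    """Averaged dimension scores and total readiness for an (n, 30) array."""
    items = np.asarray(items, dtype=float)
    scores = {name: items[:, cols].mean(axis=1) for name, cols in DIMENSIONS.items()}
    scores["TotalRead"] = items.mean(axis=1)
    return scores

rng = np.random.default_rng(1)
answers = rng.integers(1, 6, size=(104, 30))   # simulated 5-point Likert answers
s = score_ir(answers)

# TotalRead is the average of the dimension scores weighted by item counts.
weighted = (11 * s["Reval"] + 12 * s["Ident"] + 7 * s["Commit"]) / 30
print(np.allclose(weighted, s["TotalRead"]))
```

Because TotalRead is built from the same items as the dimensions, its strong correlations with them carry no independent information.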
An examination of the scatterplots, with the simple linear models associated with the correlations, for all pairs of these variables shows that the correlations relate, as they should, to correct simple linear models. An examination of the data also brought to light an incorrect CD4 value for subject n=74: it was wrongly recorded as 0. It was then decided to declare the CD4 value for subject n=74 missing. 5
From Table 2 and the scatterplots of the pairwise dimensions and CD4 (not shown here), it is quite evident that using the results of the IR scale as efficiently as possible to increase the CD4 measurements will be problematic, to say the least, whether one wishes to rely on ordinary regression models or on more sophisticated linear models. We will see that the problems come mainly from the high correlations among the dimensions in any model of the type: (Reval, Ident, Commit) → CD4.
4 We will consider in this article only the first time of measurement for the IR scale, as well as for the CD4 counts. Some of the exercises suggested for the tutorial complement of this report will concern the measurements at the second point in time. We observe very few differences.
5 For analyses where CD4 is one of the variables involved, the dataset is reduced by one, since subject n=74 is taken out of the sample.
The factors of influence (either from the averaged form for the dimensions, or from non-summated item responses) are too uncorrelated to the response to bring any inference results for the models. This is explained in the next section.
The purpose of our study is to describe and use optimally the influence of readiness, measured by the IR scale, to increase the CD4 counts through an appropriate patient stimulation.
First of all, we should mention that the other social variables observed in the sample, age, level of education, etc., have no bearing on the CD4s or the other variables. They will not be part of any model as co-variables.

Problems in the use of the IR scale for predictions of the response CD4
In this section, we report the impossibility of realizing our objective through linear models of the response, the CD4 counts, with the IR dimensions or the total readiness as the factors of influence. The expected concurrent validity is lacking. Table 2 shows that the correlations of each of the three dimensions of the IR, and of its total score, with the response are close to zero, so it is impossible to ascertain a linear influence of any of the dimensions of the IR scale on the CD4s, taken one at a time. What about multiple linear models?
The multiple regression model (Reval, Ident, Commit) → CD4 confirms that no linear influence of the dimensions on the response can be established (Table 3). Even worse, the signs of the Ident and Reval coefficients are wrong: greater Ident and Reval values should give rise to greater CD4 counts, not smaller ones as determined by the model.
The residuals (not reported here) show no pattern pointing to an incorrect model: the linear model is correct but with a very small $R^2$.
An examination of the residuals singles out 2 subjects, but nothing gives any reason to suspect that something is wrong with their data: subject n=19 has the largest CD4 count at 899, and subject n=60 has the smallest at 34 (mean 440,9 with a standard deviation of 161). Rare events do occur: these facts do not constitute a good reason to remove these two subjects from the regression analyses, contrary to what was decided for n=74, who, except for his incorrect CD4 count, looks otherwise perfectly 'normal', and who does not appear in the analysis on account of his missing value for CD4. We could however trim the data of its extreme CD4 values to get more robust conclusions. No significant difference is noted in doing so.
A careful analysis of the possible causes of these counterintuitive regression results points to multicollinearities among the predictors, the dimensions of the scale. High correlations among pairs of predictors tend to increase the variances of the estimated regression coefficients. Useful measures of the increased variances are the VIFs (Variance Inflation Factors), defined for each of the k predictors (Neter &al., 1996):
$$\mathrm{VIF}(k) = \frac{1}{1 - R_k^2},$$
where $R_k^2$ is the coefficient of multiple determination when the k-th predictor is regressed on the other predictors. It is readily seen that a large $R_k^2$, i.e. close to 1, indicating a strong linear link between the k-th predictor variable and the other ones, will lead to a large VIF(k).
A commonly used measure of multicollinearity is the average of the individual VIF(k)s, denoted by VIF, for a given regression model. A small VIF, i.e. close to 1 (VIFs are never smaller than 1), indicates that all the $R_k^2$ are close to 0, i.e. that there are no linear links among the predictor variables.
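The definition of the VIFs can be sketched in a few lines of Python: each predictor is regressed on the others, and its coefficient of determination gives its VIF. This is an illustration under stated assumptions, not the software used for the study: the function name `vifs` is ours and the three predictors are simulated with a common factor so that they are correlated.

```python
import numpy as np

def vifs(X):
    """VIF(k) = 1 / (1 - R_k^2), with R_k^2 from regressing column k on the others."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for k in range(p):
        y = X[:, k]
        others = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        centered = y - y.mean()
        r2_k = 1.0 - (resid @ resid) / (centered @ centered)
        out[k] = 1.0 / (1.0 - r2_k)
    return out

# Simulated predictors (assumption): a shared factor induces correlations.
rng = np.random.default_rng(3)
common = rng.normal(size=(104, 1))
X = common + 0.6 * rng.normal(size=(104, 3))

v = vifs(X)
mean_vif = v.mean()
print(np.round(v, 2), round(mean_vif, 2))
```

A known shortcut, useful as a cross-check, is that the VIFs are the diagonal elements of the inverse of the predictors' correlation matrix.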
How large should a VIF be to signal a multicollinearity problem? It depends on $R^2$, the general coefficient of determination of the model examined. With a large $R^2$, the individual VIF(k) needs to be large to prevent a predictor from being considered significant, because the larger resulting variance of its coefficient in the adjusted model prevents it from being considered nonzero.
But with a small $R^2$, all the estimated coefficients of the predictors in the model tend to be close to the level of absence of significance, even in the case of low multicollinearities between the predictors: truly influential predictors can wrongly escape significance. Indeed, even small multicollinearity problems, thus small VIFs, with the resulting larger variances for some or all coefficients, will tend to lead one to declare, wrongly, that the predictors have no influence on the response.
To put the multicollinearity problem in another light, even when an influence of a predictor is deemed significant in a model, is it because of its own intrinsic influence or because the actual effect comes from another predictor strongly correlated to it? This effective predictor may not even be included in the model, thus giving rise to a wrongly specified model, or it may be included in it, with multicollinearity properties with the other predictors that give rise to wrong conclusions as to its significance. Multiple regression models in the social and health sciences, with their frequent low $R^2$, should be validated with great care.

Table 3. Main regression results for the model: (Reval, Ident, Commit) → CD4. The coefficient of multiple determination is $R^2$ = 0,04. A careful analysis of regression is a long procedure, much of which is not described here, with screening of the variables, of the subjects, careful considerations on the diagnostics obtained, etc. For details see Neter &al. (1996). The usual basic study is reported here.
To complete the discussion, here are the three VIFs: VIF(Reval) = 3,29; VIF(Ident) = 1,91; VIF(Commit) = 2,74; for a global VIF = 2,65. This looks quite small according to the alerting values from Neter &al. (1996), but, as explained, when the coefficient of determination $R^2$ is small, even small VIFs can be large enough to prevent, perhaps wrongly, the rejection of the null hypothesis for the individual coefficients. Their p-values are here much too large to provide any clue for the decomposition of the influence of the IR dimensions on the CD4s, and surely no clue for the influence of the IR items themselves, which is what is sought in the end. Correlations in the vicinity of 0,70 between the predictors may be enough to mask the phenomena in action, if any.
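The arithmetic behind these figures can be retraced with the reported VIFs alone. The short Python sketch below recomputes the global VIF as the plain average of the three values and inverts the definition VIF(k) = 1/(1 - R_k^2) to recover the implied $R_k^2$ of each dimension regressed on the other two. Only the three VIF values come from the text; the variable names are ours.

```python
# Reported VIFs (decimal points instead of the text's decimal commas).
vif = {"Reval": 3.29, "Ident": 1.91, "Commit": 2.74}

# Global VIF: plain average of the individual VIFs.
mean_vif = sum(vif.values()) / len(vif)

# Inverting VIF(k) = 1 / (1 - R_k^2) gives R_k^2 = 1 - 1 / VIF(k).
r2_k = {name: 1.0 - 1.0 / value for name, value in vif.items()}

print(round(mean_vif, 2))
for name, value in r2_k.items():
    print(name, round(value, 2))
```

Read this way, roughly 70% of the variance of Reval, 48% of Ident and 64% of Commit is already accounted for by the other two dimensions.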
Furthermore, in view of our relatively small sample, it is of no use to try more sophisticated models, like structural equations or latent variables models with the item scores as primary observed variables.
After an initial validation and construction of a summated scale, for which large samples are needed, the final constructed scales are usually used on much smaller samples, oftentimes in the vicinity of one hundred subjects or so, precluding the use of sophisticated linear models. 6 Moreover, even when the correlations between the dimensions on the validation or princeps sample are small and the dimensions nearly orthogonal (or independent), the 'secondary' samples rarely have this property.
Should one want to use a scale for descriptive purposes only, this may not be of grave consequence. But for inferential matters (criterion-based validations for example), which rely on modeling techniques, it is an altogether different matter. The existence of approximately independent dimensions is of crucial importance in the context of rather small effects (small $R^2$s).
In order to use this study as a tool for designing better care for HIV patients, that is, care that will hopefully optimize their CD4s, we have established a procedure to restore the crucial psychometric property of the independence of the dimensions, allowing the determination of useful models for the effect of the readiness properties on the CD4 counts. This, in theory, will allow more efficient care and a better adherence to the HIV treatments.

A statistical procedure to obviate the difficulties
Fairly new in the context of scale validation in the eighties (Carmines & Zeller, 1979), PCA is now a standard staple in the construction and use of tests for validation purposes, whether on the princeps sample, or on the applications of the scales on much smaller samples (DeVellis, 2011; McIntire &al., 2010). This was the case with the IR scale (Fleury, 1991 & 1994), which relied heavily on what is nowadays called 'exploratory factor analysis'. 'Confirmatory factor analysis' (CFA), and a whole set of newer developments of structural models, were designed to confirm (or not), or even determine, a model on the princeps dataset or on other datasets (e.g. Probst, 2003, for an excellent case study; Tabachnik & Fidell, 2007). Their use can usually be advised only for large datasets (comprising several hundred subjects), because so many variables (items) are observed and the structures constitute elaborate models, e.g. causal models, with conceptualized latent variables.
Remedial measures for the multicollinearity problems from CFA methodologies are impossible for our study due to its small sample size. 7 It is far from evident that the psychometric structure of any given scale is preserved in all its applications, especially concerning the independence of the conceptualized dimensions. Validation and construction are one thing; confirmation on other datasets is altogether another matter.
How to improve, on our dataset, the independence structure of the dimensions defined in the IR scale is the object of the next section.

A novel procedure, close to the scale's structure: partial principal component analysis (P-PCA)
We will assume the original scale is well defined, well constructed and well validated, and that the items selected for each of the dimensions are in reality strongly correlated within their own dimension (i.e. large coefficient alpha), with low correlations with the items from the other dimensions. The dimensions are supposed to be independent, but this is impossible to ascertain for our application.
And so, the multicollinearities across dimensions are supposed to be an artifact of our sample. Actually, this seems to be a quite frequent situation outside the princeps samples, and so there is a need for a general procedure, firstly to observe the structure of the items from each dimension on a given sample, and from there go a step beyond, put into action a predicting model, and finally draw practical consequences from the model.
Assuming then the correctness of the split of the items into the 3 dimensions, a principal component analysis on the items would, in theory, give only one significant component per dimension: loadings would be large for each dimension's items on a unique component, and close to zero for the other dimensions' items (Carmines & Zeller, 1979). Furthermore, items foreign to a dimension would have almost no correlation with that dimension's component.
It is not the case here: we observe more than one component deemed to be significant according to the usual criteria, chiefly the Kaiser criterion and the scree test (Sharma, 1996; Tabachnik & Fidell, 2007), for each set of items isolated according to their dimension. The important property holds, as in any PCA: the components defined for each set of items are orthogonal, i.e. their correlations are pairwise zero. But the loadings for each item are large for many components. The correlation structure is very intricate.
Our procedure then consists of running a PCA on the subset of items defining a dimension, and doing so for each dimension. We define a sort of partial principal component analysis (P-PCA) on our dataset, in the following way:
1 - For each set of items of a given dimension of the scale we perform a principal component analysis. For each dimension we obtain its significant principal components from the usual criteria. We call the significant partial components the pseudo-items of that dimension. The items from each dimension define the correlation structure of the pseudo-items.
2 - We then project all the subjects on this new reduced set of components for the said dimension: the values of the pseudo-items for a given subject are its projection values. When this is done for all the dimensions we get a new and much reduced set of pseudo-items representing the data: the partial components, with all their values for the subjects, separated in the 3 postulated dimensions. And a new correlation structure among them.
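The two steps above can be sketched as follows in Python. This is a minimal illustration under explicit assumptions, not the authors' implementation: the PCA is run on each block's correlation matrix, only the Kaiser criterion (eigenvalue greater than 1) is used to retain components, whereas the text also mentions the scree test, the item-to-dimension layout is hypothetical, and the data are simulated. The names `partial_pca` and `cross_correlations` are ours.

```python
import numpy as np

def partial_pca(items, blocks):
    """Step 1 and 2: per-dimension PCA, then subject scores on the kept components."""
    items = np.asarray(items, dtype=float)
    z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
    scores = {}
    for name, cols in blocks.items():
        zb = z[:, cols]
        eigval, eigvec = np.linalg.eigh(np.corrcoef(zb, rowvar=False))
        order = np.argsort(eigval)[::-1]           # largest eigenvalue first
        eigval, eigvec = eigval[order], eigvec[:, order]
        keep = eigval > 1.0                        # Kaiser criterion (assumption)
        scores[name] = zb @ eigvec[:, keep]        # pseudo-item values per subject
    return scores

def cross_correlations(scores):
    """Correlation matrix of all pseudo-items, within and across dimensions."""
    names = [f"{dim}PC-{j + 1}" for dim, s in scores.items() for j in range(s.shape[1])]
    return names, np.corrcoef(np.column_stack(list(scores.values())), rowvar=False)

# Simulated items (assumption): a general factor plus noise, 105 subjects.
rng = np.random.default_rng(2)
blocks = {"Rev": slice(0, 11), "Id": slice(11, 23), "Com": slice(23, 30)}
items = rng.normal(size=(105, 1)) + rng.normal(size=(105, 30))

pseudo = partial_pca(items, blocks)
names, corr = cross_correlations(pseudo)
print(names)
```

Within a dimension the pseudo-items are uncorrelated by construction; the quantities of interest are the off-block entries of `corr`, which play the role of the inter-dimensional correlations.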
Orthogonality is naturally obtained within dimensions; 8 what is interesting is the structure across dimensions. We will not report results based on rotations, even though we have extensively explored many different rotations, to no avail. The structure described here is roughly optimal and does not change much with rotations.
A look at Table 4 shows the P-PCA procedure in a clearer light.
What is interesting to observe in the components is that all inter-dimensional correlations among the partial components (pseudo-items) are reduced compared to the original dimensions and items.
Another remarkable thing is that there are partial components inter-correlations very close to 0. In Table 4, we have signaled (with a bold typeface), the 3 partial components with their mutual correlations closest to 0, among a certain number of other possibilities.
Finally one can observe that with sample size of N=104 subjects, the only significantly non-zero correlations of the pseudo-components and the CD4s are with the component obtained through the 'Identify the barriers' (Ident) items.
The advantage of the P-PCA procedure is that it is very respectful of the assumed valid structure of the original items aggregated in their dimensions.
But the procedure does look somewhat contrived. Some sort of validation is then in order. What if our general hypothesis, that the scale measures what it purports to measure, is false?
The validation strategy that we propose here would serve as a confirmation of this fundamental hypothesis. One can never say that a scale is true to its purpose (here, measuring readiness for the stringent requirements of AIDS therapy). But after the validations described in the next section, one feels reassured.
In Table 4, one observes the correlations of the three pseudo-items for each dimension, denoted 'RevPC 1', 'RevPC 2', 'RevPC 3' for the Revaluation dimension (and respectively the 'IdPCs' for the 'Identify barriers' dimension and the 'ComPCs' for the Commitment dimension). The matrices of intra-dimensional correlations are identity matrices (this is a property of PCs). The other matrices are for inter-dimensional correlations. Also included in Table 4 are the correlations of these 9 pseudo-items with the CD4s. The cells in bold type are three inter-correlations close to 0, i.e. close to independence. In the CD4 column, bold type signals the only significant correlations between the partial components and CD4.
Validations. Are the properties reported in Table 4 an artifact of the P-PCA procedure? To test this possibility, instead of using the original split of the scale's 30 items for each dimension, we have randomly split them into 3 groups in the same proportions as for the original split. And then we create P-PCA components for the 3 ersatz dimensions and check the analog of Table 4. And repeat this random experiment a large number of times.
A typical analog of Table 4 is reported in Table 5. We readily observe that the main property of Table 4 is not observed in Table 5. It is impossible to find an uncorrelated threesome of components, one from each ersatz dimension. Notice that there is one independent pair: PC11 & PC32. But there is no PC from the ersatz second dimension that is uncorrelated with both: PC32 & PC21 have a correlation of 0,09, but PC21 has a correlation of 0,70 with PC11. Actually there is no marked and systematic reduction of inter-dimensional correlations.
None of our simulated dimensions exhibits the marked orthogonality property sought after, and obtained naturally with the original split.
It would be possible to produce automatic validations of this sort in great numbers. We have simply tried a couple of dozen, and none of them gives results anywhere close to the ones observed in Table 4. Even those with compositions of the ersatz dimensions fairly close to the original ones failed to produce uncorrelated threesomes.
Only the original split of the 30 items into the 3 assumed dimensions restored some degree of independence among the P-PCs. Even though this is a heuristic validation process, with no quantitative measures involved, we would hypothesize that the behavior observed is a sort of validation for the original split of the items into their dimensions.
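The random-split validation described in this section could be automated along the following lines. The Python sketch below is only an illustration under assumptions of ours: it summarizes each split by the least-correlated threesome of components (one per group), measured by the largest absolute pairwise correlation within the threesome, a criterion we introduce here for convenience and which the text does not define. Components are retained with the Kaiser criterion only, the item layout is hypothetical, and the data are simulated.

```python
import itertools
import numpy as np

def block_scores(z, cols):
    """Scores on the components with eigenvalue > 1 for one group of items."""
    eigval, eigvec = np.linalg.eigh(np.corrcoef(z[:, cols], rowvar=False))
    return z[:, cols] @ eigvec[:, eigval > 1.0]

def best_triple(z, split):
    """Smallest 'largest |correlation|' over threesomes taking one PC per group."""
    scores = [block_scores(z, cols) for cols in split]
    best = np.inf
    for a, b, c in itertools.product(*(range(s.shape[1]) for s in scores)):
        trio = np.column_stack([scores[0][:, a], scores[1][:, b], scores[2][:, c]])
        r = np.abs(np.corrcoef(trio, rowvar=False))
        best = min(best, r[np.triu_indices(3, k=1)].max())
    return best

def random_split(rng, sizes=(11, 12, 7)):
    """Random partition of the items in the same proportions as the original split."""
    perm = rng.permutation(sum(sizes))
    return [np.sort(part) for part in np.split(perm, np.cumsum(sizes)[:-1])]

# Simulated data (assumption): three correlated latent dimensions, 105 subjects.
rng = np.random.default_rng(4)
n = 105
latent = rng.normal(size=(n, 3)) + 0.8 * rng.normal(size=(n, 1))
original = [np.arange(0, 11), np.arange(11, 23), np.arange(23, 30)]
raw = np.column_stack(
    [latent[:, [d]] + rng.normal(size=(n, len(cols))) for d, cols in enumerate(original)]
)
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

observed = best_triple(z, original)
ersatz = [best_triple(z, random_split(rng)) for _ in range(24)]
print(round(observed, 2), np.round(sorted(ersatz)[:3], 2))
```

Comparing `observed` with the distribution of the `ersatz` values would turn the heuristic check into a rough permutation-style comparison; with simulated data the outcome says nothing about the IR scale itself.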

Optimizing the CD4 response
Now we can return to our main objective. Is there a way to stimulate the readiness of an HIV patient so that it could bring an increase in the CD4 counts? More to the point, can we detect the items of the original scale related to high CD4 counts, so that increasing the values of some of them and decreasing the values of others would produce such an effect?
Since all the P-PCs have much reduced pairwise correlations (with few exceptions) compared to the original dimensions, and much smaller VIFs (relative to the general $R^2$s of the adjusted models), we will now regress CD4 counts on the 9 P-PCs. Then we will go back to the correlation structure of the items on the partial principal components that influence the CD4 counts, to single out the most influential original items, and finally we will show how, and by how much, it would eventually be possible to increase the CD4 counts of subjects by a carefully designed use of this information.
And so, as a first step, we will use the regression models with all the assumed 9 pseudo-items as predictors of CD4, and then we will use a forward stepwise regression procedure to single out the most interesting ones.
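A forward stepwise selection of the kind mentioned above can be sketched as follows in Python. The entry rule used here, a partial F statistic compared with a fixed threshold (`f_to_enter`, default 4.0), is a classical choice that we assume for the illustration; the text does not state which entry criterion or software was used. The data are simulated: nine hypothetical predictors, of which two truly influence the response.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

def forward_stepwise(X, y, f_to_enter=4.0):
    """Add, one at a time, the predictor with the largest partial F above the threshold."""
    n, p = X.shape
    selected = []
    current = np.ones((n, 1))                      # intercept-only model
    rss_cur = rss(current, y)
    while len(selected) < p:
        best = None
        for j in [j for j in range(p) if j not in selected]:
            cand = np.column_stack([current, X[:, j]])
            rss_new = rss(cand, y)
            f = (rss_cur - rss_new) / (rss_new / (n - cand.shape[1]))
            if best is None or f > best[0]:
                best = (f, j, cand, rss_new)
        if best[0] < f_to_enter:
            break                                  # no remaining predictor qualifies
        _, j, current, rss_cur = best
        selected.append(j)
    return selected

# Simulated data (assumption): columns 5 and 7 drive the response.
rng = np.random.default_rng(6)
X_demo = rng.normal(size=(103, 9))
y_demo = 3.0 * X_demo[:, 5] - 2.0 * X_demo[:, 7] + rng.normal(size=103)

chosen = forward_stepwise(X_demo, y_demo)
print(chosen)
```

With a low threshold such as 4.0, a spurious predictor can occasionally enter by chance, which is one reason the selected model still has to be validated afterwards.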
We have screened the subjects for possible outliers to make sure the results obtained are robust: no subject seems to change much the conclusion retained from the regression models.
Finally we will use the original item decomposition (loadings) on the P-PCs retained by the preceding procedure to infer conclusions as to what could be done to increase the CD4 counts for patients.
A final remark is in order before proceeding. As noticed in Table 2, the negative correlations of the CD4s with the three dimensions, Reval, Ident & Commit, even though non-significant, are counterintuitive: the higher the value of any dimension, the higher the CD4s should be, since the dimensions should have a positive effect on the response. This basically was the starting point of this study.
But for the composite partial components, the pseudo-items, there is no prescribed sign for their correlations with the CD4 counts, or in general with any outside variable or indicator. The sign of their action on CD4, if any, depends on their decomposition in terms of the original IR items. Principal components are defined by the oppositions and the signs of the large correlations (loadings) of the items that constitute them. We will see this in action in the following, since the bottom line of the study will be prescriptions for what should be done to increase the CD4 counts, what IR [...]
Results for linear models with the nine P-PCAs. First of all, as said, there is no reason why we should not try all the P-PCAs, or pseudo-items, as predictors to figure out their influence on the response variable CD4. They all represent the three dimensions and they are orthogonal within their own dimension. Table 6 reports the main numerical results. Only one P-PCA, IdPC-3, is deemed to have an influence on the response variable in the presence of the others.
The residual diagnostics of the model show no particular problems, no outliers, etc. However it is sometimes convenient to check the model validity by trimming the dataset, that is eliminate the subjects with the extreme values of the response that can have a large and undue influence on the results through standard diagnostics measures for influence (Mahalanobis and Cook's distance, deleted residuals).
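Among the influence diagnostics cited above, Cook's distance is easy to compute directly from the hat matrix. The Python sketch below uses the standard formula D_i = e_i^2 h_ii / (p s^2 (1 - h_ii)^2) on simulated data in which one observation is deliberately made aberrant; the trimming of the two most influential cases at the end mimics the trimmed-dataset check, but the choice of two is only an assumption borrowed from the text.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance of each observation for the OLS fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    p = X1.shape[1]
    H = X1 @ np.linalg.pinv(X1.T @ X1) @ X1.T     # hat matrix
    h = np.diag(H)                                # leverages
    e = y - H @ y                                 # residuals
    s2 = e @ e / (len(y) - p)                     # residual variance estimate
    return e**2 / (p * s2) * h / (1.0 - h) ** 2

# Simulated data (assumption): a clean linear relation plus one aberrant case.
rng = np.random.default_rng(7)
x = rng.normal(size=50)
y = 2.0 * x + 0.5 * rng.normal(size=50)
x[0], y[0] = 6.0, -10.0                           # high leverage, large residual

d = cooks_distance(x[:, None], y)
keep = np.argsort(d)[:-2]                         # drop the two most influential cases
print(int(np.argmax(d)), len(keep))
```

Refitting the model on `x[keep]`, `y[keep]` and comparing the coefficients with the full fit is the kind of robustness check described in the text.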
If we trim the dataset of the two extreme CD4 values, n=19 and n=60, which are also the subjects with the extreme values of the diagnostic measures of influence, we get essentially the same results on the so-called trimmed dataset, but with $R^2$ = 0,10 and a global p-value of 0,36, hardly an improvement.
The last model reported here (Table 7), among the many examined, is obtained from the 9 P-PCs through the forward stepwise procedure for the selection of the active predictors.
For the determination of the P-PCs we work with the full dataset, since the only possible outliers (as verified at the beginning of this study, and also in the preceding regression analyses) stem from the CD4 values. But subject n=74 is a missing value for the regression models.
9 For more information on the nature and interpretation of components, see Sharma (1996, pp. 68-71), or DeVellis (2011).
Again, one can observe the model on the full dataset, or on the trimmed dataset. We report only the results for the trimmed dataset. Needless to say we have exercised great care in ascertaining that our model is perfectly valid (See Neter &al., 1996, for details).
So we can infer that only two pseudo-items have an influence on the CD4: IdPC-3 and ComPC-2. They are the active pseudo-items.

Interpreting the active P-PCs (pseudo-items) in terms of the original items.
In view of the regression models (Table 7), we report here the results only for the P-PCs of the 'Identify the barriers' and 'Commitment' dimensions, since no P-PC from the Revaluation dimension is significant in our models.
From Tables 8 & 9, we get the following interpretation of the first components for both dimensions. 10 All the correlations with the items are of the same sign and large, indicating a general increase for the global scores (Identify & Commitment, respectively) along the first components: in the positive direction for Identify, and the negative one for Commitment. So the patients of the sample are ordered along the first components, with a general value for their respective dimensions. The two first pseudo-items, IdPC-1 and ComPC-1, are called for this reason scaling components. 11
The third component for Identification of problems and the second one for the commitment attitude are the only ones significantly related to CD4 (Table 7). So the subjects are ordered along these components in decreasing CD4s for IdPC-3, since its regression coefficient is negative, and in increasing CD4s for ComPC-2, its regression coefficient being positive. The other P-PCs for either dimension are of no interest here, and no clear interpretation for the CD4s can be obtained from them, since only IdPC-3 and ComPC-2 are significant factors to explain the CD4 counts (Table 7).
The effects of the original items on CD4. The two active P-PCs, IdPC-3 and ComPC-2, are loaded (correlated) with the original items (Tables 8 & 9), so it is from their underlying structures that an interpretation for the IR items' roles has to be sought.
10 The same interpretation applies also to RevPC-1 (not reported here).
11 Exactly as for general indices such as the Consumer Price Index (CPI), the correlations (loadings) are considered as weights for the items in the first components. See Sharma (1996) for further explanations.
Not all items from each of these two dimensions play a role in the effect of the P-PCs on the values of CD4. For the two active P-PCs, let us consider only the items with sufficiently high correlations (loadings) with their respective components; the other items play a much lesser, actually negligible, role. Ad hoc cutoff points were fixed at roughly 0.30 for the absolute values of the correlations (Table 10). These items are called 'active' items for the active pseudo-items.
Since the loadings are correlations, the structure of a PC is essentially defined by the oppositions among them: for example, in the case of IdPC-3 (Table 10, left), an increase in the value of Ready-17 for a subject would automatically entail a decrease in the value of Ready-29, more or less marked depending on the values of the loadings, and conversely.
Determination of an optimal strategy for care improvement.
Now we can put together the results from Tables 7 & 10.
We will use the strategy of defining supplementary (or illustrative) variables and cases, i.e. cases (artificial or observed) and variables that have not been used in the PCA calculations, but whose projections, that is, their locations in the PC-planes, can be calculated. This will allow us to illustrate various conclusions that we draw from the analyses. Only active cases and variables are used in the calculations; non-active variables and cases in a PCA can serve illustrative purposes very efficiently (Lebart &al., 1984; Saporta, 2006, chap. 7), hence their name of supplementary or illustrative variables (cases). We have defined 3 basic supplementary cases: 'Ave', whose values for all the items on the IR scale are the item averages; 'Min', whose values for all items are set at 1, the minimum value of the Likert scale; and 'Max', whose values on all the items are at 5.
The illustrative or supplementary variable is CD4, the one we wish to maximize through a careful manipulation of the active IR items.
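The projection of supplementary cases and of a supplementary variable can be sketched as follows. This is only a minimal illustration on synthetic data, assuming a PCA on the correlation matrix; the function names (`fit_pca`, `project_cases`, `project_variable`), the sample size and the number of items are hypothetical and do not come from the study.

```python
import numpy as np

def fit_pca(X):
    """PCA on the correlation matrix of the active cases X (rows = cases)."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    eigval, eigvec = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigval)[::-1]          # largest eigenvalue first
    return mean, std, eigval[order], eigvec[:, order]

def project_cases(rows, mean, std, eigvec):
    """Coordinates of active or supplementary cases on the components."""
    return ((rows - mean) / std) @ eigvec

def project_variable(v, scores):
    """Coordinates of a supplementary variable: its correlation with each component."""
    return np.array([np.corrcoef(v, scores[:, j])[0, 1]
                     for j in range(scores.shape[1])])

rng = np.random.default_rng(0)
X = rng.integers(1, 6, size=(60, 5)).astype(float)  # synthetic Likert answers, not study data
cd4 = rng.normal(450.0, 150.0, size=60)              # synthetic response, not study data

mean, std, eigval, eigvec = fit_pca(X)
scores = project_cases(X, mean, std, eigvec)

# Supplementary cases 'Ave', 'Min' and 'Max': they use the centering, scaling
# and axes of the active cases but take no part in computing them.
supplementary = np.vstack([X.mean(axis=0), np.full(5, 1.0), np.full(5, 5.0)])
supp_coords = project_cases(supplementary, mean, std, eigvec)
cd4_coords = project_variable(cd4, scores)
```

In this sketch the 'Ave' case necessarily falls at the origin of every PC-plane, which makes it a convenient reference point for the other supplementary cases.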
To increase the CD4 counts we need to induce a patient to move toward the lower end of the partial component IdPC-3 and toward the higher end of ComPC-2, because their regression coefficients are negative for the first and positive for the second (Table 8).
Contrary to the intuitive strategy of increasing the values of all the items, it should be fairly obvious by now that increasing some items will automatically decrease others, namely those in pairs with opposed correlation signs, and this affects the CD4 counts. So the optimal strategy should be based on a careful analysis of the relationships among the items through the P-PCAs.
In view of the correlations, larger in absolute value than the cutoff, of the IR items of the 'Identify the barriers' dimension with the partial component IdPC-3, it now appears that we need scores as large as possible for items 10, 26 & 29, and as small as possible for items 13 & 17 (Table 11). The values of the other items have no importance for the maximization of CD4, in view of their low correlations with IdPC-3. With a similar reasoning, from the high correlations (larger in absolute value than 0.30) of the items from the 'Commitment' dimension, and from their signs, we need values as large as possible for items 12, 14 & 19, and as low as possible for items 3 & 5 (Table 12).
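The selection rule just described (keep the items above the cutoff, then push each one toward the end of the Likert scale that moves the predicted CD4 upward) can be sketched as follows. The loading magnitudes below are invented for illustration; only their signs and the sign of the regression coefficient follow the text, and the function name `active_item_targets` and the label `Ready-X` are hypothetical.

```python
LIKERT_MIN, LIKERT_MAX = 1, 5
CUTOFF = 0.30  # ad hoc cutoff on the absolute value of the loadings

def active_item_targets(loadings, beta, cutoff=CUTOFF):
    """Map each active item to the Likert value that raises the predicted response.

    loadings: {item: correlation of the item with the active pseudo-item}
    beta:     regression coefficient of the pseudo-item (only its sign matters)
    """
    targets = {}
    for item, loading in loadings.items():
        if abs(loading) < cutoff:
            continue  # non-active item: negligible effect, left untouched
        # Raising an item moves the pseudo-item in the direction of its loading;
        # the response then moves in the direction of beta * loading.
        targets[item] = LIKERT_MAX if beta * loading > 0 else LIKERT_MIN
    return targets

# Invented magnitudes for IdPC-3; 'Ready-X' stands for any non-active item.
loadings_idpc3 = {
    "Ready-10": -0.45, "Ready-13": 0.52, "Ready-17": 0.48,
    "Ready-26": -0.41, "Ready-29": -0.40, "Ready-X": 0.12,
}
targets_idpc3 = active_item_targets(loadings_idpc3, beta=-1.0)  # negative coefficient
```

With these assumed signs the sketch returns the maximum value for items 10, 26 and 29 and the minimum value for items 13 and 17, in line with the prescription above.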
The same conclusions for the roles of the active items can visually be obtained from the graphs of Figure 1 with the help of the projections of the supplementary CD4. We only consider in these graphs the items with large coordinates on the IdPC-3 or ComPC-2 components. One must remember that an acute angle between two variable vectors signals a large positive correlation (this projected angle in any given PC-plane of representation is an approximation of the exact angle whose cosine is the correlation). For example the cosine of the angle between CD4 and Read-29 in the first of the two graphs in Figure 1 is a rough indication that their correlation is positive, and so an increase of Read-29 will result in an increase on CD4. Inversely, the obtuse angle between CD4 and Read-3 in the second graph of Figure 1 (almost a flat angle), indicates that we should seek a small value for Read-3 to increase CD4. We can draw similar conclusions for the optimal effects of the other singled out items on CD4.
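The geometric reading used here rests on a standard identity: the correlation between two variables is the cosine of the angle between their centered vectors. A minimal numerical check, on made-up values that are not taken from the study, could look like this (the function name `cosine_between_centered` is hypothetical):

```python
import numpy as np

def cosine_between_centered(a, b):
    """Cosine of the angle between two variables after centering them."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

item = np.array([1.0, 2.0, 4.0, 5.0, 3.0])            # made-up item scores
cd4 = np.array([410.0, 380.0, 520.0, 600.0, 450.0])   # made-up CD4 counts

cosine = cosine_between_centered(item, cd4)
pearson = float(np.corrcoef(item, cd4)[0, 1])
angle_degrees = float(np.degrees(np.arccos(cosine)))  # acute angle <=> positive correlation
```

In a PC-plane only the projected angle is visible, so the identity holds exactly in the full space and approximately on the graph.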
So, following those prescriptions, summarized in Tables 11 & 12 or read from the graphs in Figure 1, we define 6 other supplementary cases by pairs: (1) 'Min-Max' is the 'Min' supplementary case with the optimal choice of only the items singled out in IdPC-3 and ComPC-2, and 'Min-Min' the same case with the anti-optimal values for the items; (2) similarly, 'Ave-Max' is the 'Ave' case with the optimal choices for the items, [...] And so for all patients, their coordinate on the IdPC-3 or ComPC-2 pseudo-items will not change much except through a careful manipulation of very few items on the original IR scale.

Table 10. Tables 8 & 9, keeping only the lines with correlations greater than, by convention, roughly 0.3 in absolute value, the other ones being much smaller. We get the following picture for the active items for the significant predictors.
Ready-28
Ready-29 -0.359
A final remark: shifts of the other 'Identify the barriers' items along the same lines have been tested but, as expected in view of their lower correlations, give almost no change in the CD4 counts. The same remark applies to the items from the 'Commitment to goals' dimension: only the 'active' items are important to control.
A strategy for optimal care.
From the semantics of the items (Tables 12 & 13), it is now evident that the oppositions are between contemplation and action or decision. The more a patient is in the process of evaluating, the less he is in a decision and action phase. In some sense he is less ready to change and to adhere to a difficult treatment. The optimal strategy for an intervention on AIDS patients using the readiness items to optimize the CD4 counts is to increase the values of the decision items from the 'Identify the barriers' and the 'Commitment' dimensions, which, automatically by the effect of the negative correlations, will lower the reflection items; acting on both gives optimal results. The motto of the designed intervention could be "Enough pondering & rumination, action!"
The predicted effect of our optimal strategy.
Finally, using again the linear model reported in Table 7 to estimate the CD4 values of the optimal patients, we observe from the final Table 13 that the optimal patient, Ave-Max, average in all respects except for the selected active items, has a predicted CD4 count of roughly 735, to be compared with the value predicted for the average patient: 446. The other optimal supplementary subjects do not differ much from the average one (see Figure 4).
One can also notice from Table 13 that the confidence intervals for the two values do not intersect: according to our model, the optimal value is assuredly larger than the average one.
In actual practice, after a first intervention has been designed and implemented according to the principles obtained from our study, new results from the IR scale would become available at some later time to design and implement another optimal strategy, so as to further increase the patients' CD4 values.

Discussion
The purpose of this study is to examine the influence of the readiness characteristics of HIV patients, as measured by the IR scale, on their adherence to their usual treatments, with the CD4 counts as the indicator, and also, if possible, to define an appropriate stimulation of their readiness to increase their CD4 counts.
Tables 11 and 12 give the original English versions of the items.

The dimension structure of a questionnaire might be valid for the princeps sample, usually large, but experience shows that on small samples this structure is often inextricable. This is the case here, as was also reported for some other uses of this psychometric scale (Enriquez &al., 2004).
In our case, the normal predictors in the regression models should be the scores for the dimensions defined on the IR scale. However, the dimensions are not orthogonal but show large inter-correlations, even though the other validity indicators, essentially Cronbach's alphas and the test-retest correlations, are quite decent. In the presence of these high correlations among predictors, we cannot determine any regression model for the CD4s on the predictors. The same problem appears, even more acutely, when the 30 items themselves are used as predictors.
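One common way to quantify how much such inter-correlations damage a regression is the variance inflation factor (VIF), which can be read off the diagonal of the inverse of the predictors' correlation matrix. The sketch below uses this diagnostic as an illustration only; it is not reported in the study, and the common correlation of 0.8 among three dimension scores is a hypothetical value.

```python
import numpy as np

def variance_inflation_factors(R):
    """VIF of each predictor, from the correlation matrix R of the predictors.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing predictor j
    on the others; it equals the j-th diagonal element of the inverse of R.
    """
    return np.diag(np.linalg.inv(np.asarray(R, dtype=float)))

# Hypothetical correlation matrix of three strongly correlated dimension scores.
R_dimensions = np.array([
    [1.0, 0.8, 0.8],
    [0.8, 1.0, 0.8],
    [0.8, 0.8, 1.0],
])
vif = variance_inflation_factors(R_dimensions)   # about 3.46 for each predictor
```

A VIF of 1 corresponds to an uncorrelated predictor; larger values mean that the variance of the estimated coefficient is inflated by that factor.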
Some difficulties in the use of Cronbach's alphas (Cronbach, 1951) have been known for a long time (Carmines & Zeller, 1979). Recent literature also reports much doubt as to the usefulness of the alpha to evaluate the reliability (internal consistency) of many scales, as well as to what constitutes an acceptable level for the alphas (e.g. Shevlin &al., 2000; Schmitt, 1996; Tavakol, 2011). Even Cronbach, in a posthumous article (Cronbach & Shavelson, 2004, from a series of notes written by Cronbach in 1997), expresses some doubts as to the overall usefulness of the alphas to describe properly the internal consistencies of the dimensions of many multidimensional scales. He sees "the alpha coefficient within a much larger system of reliability analysis" (p. 416). Other indicators have been proposed besides the alphas and studied extensively (e.g. Sijtsma, 2009; Revelle & Zinbarg, 2009). This notwithstanding, alphas are still much in favor and are routinely reported, often exclusively (Wigley, 2011). Our interest here does not lie with the indicators for the reliability or internal consistency of the IR scale, but with its concurrent or criterion based validity (McIntire & Miller, 2010).
More importantly for the usability of coefficient alpha, we have here a case in point of a fact noted by quite a few authors: a high alpha does not indicate unidimensionality of a concept or dimension (Green &al., 1977; Schmitt, 1996; Sijtsma, 2009; Tavakol, 2011; Wigley, 2011). Not only do we observe large inter-dimensional correlations, but the intra-dimensional structure is also inextricable.
Perhaps our problem stems from the small size of the princeps sample that determined the IR scale (Fleury, 1994). Fewer than 150 subjects are not totally adequate for the deployment of the correlation properties of 30 items. But in health studies it is often quite difficult to obtain sample sizes large enough to properly validate scales of measurement (DeVellis, 2011). However, the concept of readiness and its relations to adherence for chronic diseases is important (e.g. Enriquez &al., 2004; Sorenson &al., 1998). Our solution to the problem of correlated multidimensional scales appears to be a way out of the problem for the IR scale, and maybe for other scales.
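The point that a high alpha does not imply unidimensionality can be checked on a small constructed example. The sketch below assumes a hypothetical 10-item scale made of two uncorrelated blocks of 5 items, with a within-block correlation of 0.64; these numbers are invented for illustration and have nothing to do with the IR data. Alpha is computed with the usual formula, from the item covariance (here correlation) matrix.

```python
import numpy as np

def cronbach_alpha(cov):
    """Cronbach's alpha from the covariance (or correlation) matrix of the items.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score),
    the variance of the total score being the sum of all entries of the matrix.
    """
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    return k / (k - 1) * (1.0 - np.trace(cov) / cov.sum())

# Hypothetical scale: two blocks of 5 items, correlated within, uncorrelated between.
k_block, r_within = 5, 0.64
block = np.full((k_block, k_block), r_within)
np.fill_diagonal(block, 1.0)
zeros = np.zeros((k_block, k_block))
R_items = np.block([[block, zeros], [zeros, block]])

alpha = cronbach_alpha(R_items)                        # close to 0.80
eigenvalues = np.sort(np.linalg.eigvalsh(R_items))[::-1]
n_components_kaiser = int(np.sum(eigenvalues > 1.0))   # components with eigenvalue > 1
```

Under these assumptions alpha is about 0.80, a value usually judged acceptable, while the correlation matrix has two eigenvalues above 1, that is, two clearly distinct components.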
We have explored a methodology based on partial principal components analyses (P-PCA) defined on the IR items from each of its dimensions: each one exhibits not one but 3 'significant' components according to the Kaiser criterion and the scree diagram. We call these 3 partial components pseudo-items for the dimensions, since we can calculate scores, or values along each of them, for each patient.
The ad hoc heuristic validations of the P-PCA procedure that were devised indicate that the 9 pseudo-items do not appear to be spurious random effects. Indirectly, they also provide some validation of the scale itself with its proposed separation into three sets of items for each dimension.
The first and main P-PC for each of the dimensions is a scaling component, showing that the items, at least on that component, provide a decent representation of the dimension: indeed a scale, hence the name for such a component. But still, we cannot determine a regression model for CD4 on the 3 scaling components representing the IR dimensions.
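The P-PCA step can be sketched as a separate PCA on the items of each dimension, keeping the components retained by the Kaiser criterion (eigenvalue above 1) and using their scores as pseudo-items. This is a minimal sketch on synthetic answers, assuming PCAs on correlation matrices; the split into 10 items per dimension, the sample size of 80 and the function name `partial_pca` are assumptions made for the example, not facts from the study.

```python
import numpy as np

def partial_pca(block):
    """PCA of one dimension's items; keeps components with eigenvalue > 1."""
    block = np.asarray(block, dtype=float)
    z = (block - block.mean(axis=0)) / block.std(axis=0)   # standardized items
    eigval, eigvec = np.linalg.eigh(np.corrcoef(block, rowvar=False))
    order = np.argsort(eigval)[::-1]                       # largest eigenvalue first
    eigval, eigvec = eigval[order], eigvec[:, order]
    keep = eigval > 1.0                                    # Kaiser criterion
    scores = z @ eigvec[:, keep]                           # pseudo-item values per patient
    loadings = eigvec[:, keep] * np.sqrt(eigval[keep])     # item/component correlations
    return eigval, scores, loadings

rng = np.random.default_rng(1)
n_patients = 80                                            # hypothetical sample size
data = {name: rng.integers(1, 6, size=(n_patients, 10)).astype(float)
        for name in ("Identify", "Commitment", "Revaluation")}  # synthetic answers

results = {name: partial_pca(block) for name, block in data.items()}
# Pseudo-items of all dimensions side by side, ready to serve as regression predictors.
pseudo_items = np.column_stack([scores for _, scores, _ in results.values()])
```

On purely random answers such as these, the number of retained components carries no meaning; in the study, 3 components were retained per dimension, giving 9 pseudo-items.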
Actually the correlation structures on the first P-PCs are not the only ones to describe the structural properties of the readiness dimensions, even if by nature a scaling component refers to a valid dimension representative. In fact, our other P-PCs describe an opposition structure between 2 important aspects of the items: the first aspect is related to thinking or 'contemplation', the other one to 'action'.
It is precisely this opposition, exhibited by the opposite correlations of the main items with the pseudo-items they define, that can be used to restore the predictive capacities of readiness for the CD4 counts and to define what attitude could be activated to increase the CD4 counts, which is crucial to HIV antiretroviral therapy: getting the patients out of a contemplating or brooding attitude in favor (by virtue of the opposite correlations) of increased action, that is, the setting of plans and goals. It is by stimulating these oppositions that we can define an intervention among the HIV patients to favor their adherence to treatments based on their IR item scores.
We were indeed able to perform a valid regression for the response, the CD4 counts, on the 9 pseudo-items. We have inferred, through forward stepwise regression, the effect of 2 significant partial components, or pseudo-items, on the response: one from the "Identify the barriers" pseudo-items (the third one), and the other from the "Commitment" pseudo-items (the second one).
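A forward stepwise selection of this kind can be sketched as follows. This is only an illustration on synthetic data: the entry rule (a partial F statistic of at least 4.0), the coefficients, the noise level and the sample size are all assumptions, since the exact settings of the procedure are not described here; the function names are hypothetical.

```python
import numpy as np

def residual_sum_of_squares(design, y):
    """RSS of the least-squares fit of y on the columns of design."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return float(residuals @ residuals)

def forward_stepwise(X, y, f_to_enter=4.0):
    """Add predictors one at a time while the best partial F exceeds f_to_enter."""
    n, p = X.shape
    selected = []
    design = np.ones((n, 1))                      # start from the intercept-only model
    current_rss = residual_sum_of_squares(design, y)
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            candidate = np.column_stack([design, X[:, j]])
            rss = residual_sum_of_squares(candidate, y)
            f_stat = (current_rss - rss) / (rss / (n - candidate.shape[1]))
            if best is None or f_stat > best[1]:
                best = (j, f_stat, rss)
        if best[1] < f_to_enter:
            break                                 # no remaining predictor is worth adding
        selected.append(best[0])
        design = np.column_stack([design, X[:, best[0]]])
        current_rss = best[2]
    return selected

names = ["IdPC-1", "IdPC-2", "IdPC-3", "ComPC-1", "ComPC-2", "ComPC-3",
         "RevPC-1", "RevPC-2", "RevPC-3"]
rng = np.random.default_rng(2)
X = rng.standard_normal((80, 9))                  # synthetic pseudo-item scores
# Synthetic response: negative effect of IdPC-3, positive effect of ComPC-2.
y = 450.0 - 60.0 * X[:, 2] + 60.0 * X[:, 4] + rng.normal(0.0, 30.0, size=80)

selected = forward_stepwise(X, y)
selected_names = [names[j] for j in selected]
```

On these synthetic data the first two predictors to enter are the two that were built into the response; any further entry would be a chance effect of the assumed threshold.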
By considering the signs of the regression coefficients of the 2 significant pseudo-items, as well as the high correlations (loadings) of the items with the active P-PCs, for which a cutoff value of 0.3 was used, and by using their opposition structure, we could devise an optimal intervention to maximize the CD4 counts for a subject.
In brief, one can define an intervention on the Readiness of a patient by the motto: "Enough thinking and pondering, action". The effect of the intervention could be quantified from the regression model. We have used the technique of defining illustrative subjects, not used in the principal components analyses but nonetheless projected on appropriate factor planes to visualize the expected effects. The absence of influence on the response from the other, non-active items was also established through the use of various illustrative subjects. We have performed extensive validations of our final regression model, obtained through a stepwise procedure on the 9 predictors, as well as of all the other models, and we are fairly confident of its quality: all the validations pointed to a correct and stable model. We have checked that only the items with high enough correlations with the active pseudo-items are active in the regression; the others have a negligible effect.
There is another validation that had to be performed. The CD4 counts are known to be somewhat dispersed for a given patient at any given time (Vedhara et al., 1997). The usual validation procedure in such a case is to categorize the CD4 counts along various cutoff points, and to validate the regression models for these categorized indicators. The results were not much different, qualitatively speaking, from the ones obtained on the continuous CD4 counts, thus validating the use of the imprecise CD4 counts: even though the exact coefficient values differed, the 2 active P-PCs were retrieved, and hence the same type of optimal intervention.
No data analysis should come without its validation.
Validation procedures were used extensively throughout this report. Only a few were described here. A final validation cannot be performed, though: the only way to show the correctness of our prescriptions would be to intervene within a controlled experiment and observe the effect on the CD4s of the patients subjected to an adequately defined intervention, compared to a control group.

Conclusion
By using PCA on each subset of the IR items pertaining to each of its 3 dimensions, a technique that we call partial principal components analysis (P-PCA), we were able to disentangle the 3 dimensions of the IR scale. We have thus obtained 3 sets of 3 pseudo-items describing the properties of the IR dimensions. These pseudo-items are the 3 main P-PCs for each dimension.
A regression model for the CD4 counts response on the 9 pseudo-items provided 2 of these pseudo-items that have a significant role for an increase of the CD4 counts, the active pseudo-items.
By studying the correlation structures of the 2 active pseudo-items with their respective items, we were able to determine the optimal values of the active items from the original scale that would give maximum predicted CD4 counts. By 'active items' within a dimension, we mean the ones with high correlations (loadings), defined by a standard cutoff point, with the active pseudo-item.
Artificial or illustrative subjects with these optimized values were projected on the appropriate factor spaces, illustrating their CD4 inferred values from our model.
An intervention based on the motto "No more pondering, action!", devised from the active items, should increase the CD4 counts: optimal outcomes for the CD4 counts are forecast by our model. Testing this forecast is actually the crucial validation in our context. We have almost left outside our research the data at Time 2, observed on the same subjects three months after Time 1 and exactly in the same conditions, without any designed experiment in view. We still need the validation that would stem from such an experiment.
Another tack for the validation of our methodology: we could eventually apply it to other valid multidimensional scales with entangled dimensions, or even with fairly independent dimensions. Indeed, even in the case of uncorrelated or nearly uncorrelated dimensions, we could apply the P-PCA analyses as described here. We have seen in our case that the first and main PCs of each dimension do not say the final word on the dimensions: they are scaling components. The real nature of the dimensions is derived through the opposition structures exhibited by the next significant PCs, and it is from these secondary components that it was possible here to restore the concurrent validity of the IR scale. What would be the situation with other scales?