Abstract
Objective
Predicted adult height (PAH) can be calculated using methods such as Bayley-Pinneau (BP), Roche-Wainer-Thissen (RWT), and BoneXpert based on bone age (BA) assessment. Since these methods were developed for healthy children, varying results have been reported regarding their efficacy across different patient groups. Our aim was to determine the most accurate method for PAH by comparing the BP, RWT, and BoneXpert methods in boys with constitutional delay of growth and puberty (CDGP).
Methods
Male patients with CDGP who had reached their final height (FH) were included in the study. Two experienced clinicians reassessed left-hand and wrist radiographs taken at the time of diagnosis using the Greulich-Pyle (GP) atlas to manually determine BA. Among the methods used for PAH, the GP atlas was used for BP and RWT, while we used the intrinsic GP-based application with BoneXpert.
Results
For the 62 boys included, the mean age at diagnosis was 14.2±0.8 years, with 58.1% (n=36) having a similar family history. The mean height standard deviation (SD) score was -2.1±0.9, and 24.2% (n=15) of patients received low-dose testosterone induction therapy. The median (25th-75th percentile) BAs were 12.5 (11.5-13.0) years using the GP atlas and 12.6 (11.8-13.4) years with BoneXpert (p<0.001). Boys who were or were not treated with testosterone therapy had similar mean height SD scores, median testicular volumes, and median BAs assessed by both methods. The mean target height and FH SD scores were -0.6±0.6 and -0.6±0.9, respectively (p=0.8). Almost all patients (n=60, 97%) achieved adult height within the target range, with no significant difference in the FH SD score between boys who received testosterone and those who did not (p=0.1). There was no significant difference between the FH and PAH when estimated by the BP and RWT methods (p=0.2 and p=0.6, respectively), while the BoneXpert method underestimated the FH (p<0.001). The BP and RWT methods provided better predictions in patients with delayed BA ≤2 years compared to BoneXpert (p=0.3 and p=0.4 vs. p<0.001, respectively). Conversely, the RWT and BoneXpert methods were more accurate in predicting adult height in boys with delayed BA >2 years (p=0.1 and p=0.1, respectively), while the BP method resulted in overestimation (p=0.003).
Conclusion
The RWT method was found to be a better predictor of FH compared to the BP or BoneXpert methods in boys with delayed BA ≤2 years and >2 years.
What is already known on this topic?
Predicted adult height can be calculated using methods such as Bayley-Pinneau, Roche-Wainer-Thissen (RWT), and BoneXpert, which rely on bone age assessment. However, as these methods were originally developed for healthy children, their predictive accuracy has varied when applied to different patient groups.
What this study adds?
This study demonstrated that the RWT method was more effective than other methods for estimating adult height in boys with delayed bone age, irrespective of whether the delay was ≤2 years or >2 years.
Introduction
Constitutional delay of growth and puberty (CDGP) is the most prevalent cause of short stature and delayed puberty, primarily in boys. This temporary condition is considered part of the normal growth spectrum and is characterized by a slowdown in linear growth, retarded bone maturation, and delayed onset of puberty and, consequently, of the pubertal growth spurt. The exact etiology of CDGP remains unclear, but about 50-75% of patients have a family history of delayed puberty, often following an autosomal dominant inheritance pattern. Typically, these children are expected to reach normal adult height after experiencing delayed but otherwise normal puberty (1, 2, 3).
In the management of boys with CDGP, reassurance and watchful waiting are generally adequate. However, for adolescents experiencing psychological challenges and low self-esteem, medical therapy with low-dose testosterone and psychological counseling may be necessary (1, 2, 3, 4). Clinicians often use predicted adult height (PAH) based on bone age (BA) assessments to inform patients and their parents about future growth potential. In these patients, PAH can be calculated using several methods, including the Bayley-Pinneau (BP), Roche-Wainer-Thissen (RWT), and BoneXpert application, which were originally developed for healthy children (5, 6, 7, 8, 9, 10). In addition, delayed BA is a known factor that can limit the accuracy of adult height prediction. So far, only a few studies have evaluated prediction methods in patients with CDGP, and there is limited and conflicting information about their accuracy (5, 7, 8, 9, 10, 11).
In the present study, we aimed to assess the accuracy of the BP, RWT, and BoneXpert methods for estimating adult height in boys with CDGP to provide them with realistic and more accurate information about their future height potential.
Methods
Patients
Boys diagnosed with CDGP who were referred to pediatric endocrinology units for evaluation of short stature or delayed puberty between 2010 and 2018 and who had achieved their final height (FH) were included. To increase the sample size, extended criteria for delayed puberty were applied (12, 13). The inclusion criteria were: (i) boys aged 13 years or older with a testicular volume less than 6 mL, as measured by a Prader orchidometer, who exhibited spontaneous pubertal development before the age of 18 or after pubertal induction with low-dose testosterone; (ii) absence of any endocrine or chronic medical condition; and/or (iii) a familial history of pubertal delay. Male subjects who were born small for gestational age, had received any medication, or had systemic diseases, dysmorphic syndromes, skeletal abnormalities, or pituitary hormone deficiencies were excluded.
FH was defined based on the following criteria: (i) fused epiphyses; (ii) a growth velocity of less than 1.0 cm in the preceding year; and (iii) completed secondary sexual characteristics. Boys were considered to have achieved their target height if their FH was within 1.5 standard deviation (SD) scores of the target height.
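The target-range criterion above can be sketched as a simple check. This is a minimal illustration, assuming that both FH and target height are expressed as SD scores and that the 1.5 SD limit is inclusive; the function name and the example values are hypothetical and are not study data.

```python
def within_target_range(fh_sds: float, target_sds: float, limit_sds: float = 1.5) -> bool:
    """Return True if final height lies within the target height range.

    Assumption: the range is taken as target height SD score +/- limit_sds,
    with the boundary itself counted as inside the range.
    """
    return abs(fh_sds - target_sds) <= limit_sds


# Hypothetical example values, not study data.
print(within_target_range(-0.6, -0.6))  # difference of 0.0 SD
print(within_target_range(-2.5, -0.6))  # difference of 1.9 SD
```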
Clinical Assessment
Data were retrospectively gathered on age, anthropometric measurements, physical examination findings, and parental height, as well as laboratory and radiological findings from patient medical records. Height was measured to the nearest millimeter using a Harpenden stadiometer, and weight was measured with a SECA scale (Hamburg, Germany) to an accuracy of 0.1 kg, with patients wearing only underwear and no shoes. SD scores for height, weight, and body mass index (BMI) were calculated using an online calculator (child metrics) based on Turkish standards published by Neyzi et al. (14). The pubertal stage was assessed using Tanner’s standards (15). The genetic target height was calculated by adding 6.5 cm to the average parental height, following the Tanner formula (16). The levels of luteinizing hormone (LH), follicle-stimulating hormone, and serum total testosterone (ng/mL) obtained at 8:00 am were recorded, together with the gonadotropin-releasing hormone-stimulated LH levels from the patient files.
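The target height calculation described above can be illustrated as follows. This is a minimal sketch, assuming that "average parental height" means the arithmetic mean of the father's and mother's heights in centimeters; the function name and the example heights are hypothetical and are not taken from the study data.

```python
def target_height_boy_cm(father_cm: float, mother_cm: float) -> float:
    """Genetic target height for a boy: mean parental height plus 6.5 cm."""
    mid_parental_cm = (father_cm + mother_cm) / 2.0
    return mid_parental_cm + 6.5


# Hypothetical example values, not study data.
example_target = target_height_boy_cm(175.0, 162.0)
print(f"Target height: {example_target:.1f} cm")
```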
Bone Age Assessment and Adult Height Prediction Methods
The BAs of boys with CDGP were reassessed using both manual and automated methods. Initially, BAs were independently re-evaluated by two experienced clinicians (AA and GAK) using left-hand and wrist radiographs, according to the Greulich-Pyle (GP) atlas (17). The manual BA for each patient was subsequently calculated by averaging these independent assessments. Radiographs were stored on a PACS workstation, and all images were uploaded in DICOM format before being analyzed by BoneXpert software. Subsequently, the same radiographs were evaluated using the automated BA assessment method provided by BoneXpert Standalone, based on the GP atlas (Visiana, Holte, Denmark, www.boneXpert.com) (18).
Adult height predictions were made using the BP, RWT, and BoneXpert methods. The GP atlas was used for both the BP and RWT methods. For the BoneXpert method, two approaches were employed: (1) PAH was calculated using the BA automatically generated by the BoneXpert software, which is based on the GP atlas, after uploading left-hand X-rays; and (2) PAH was calculated by entering the BA manually assessed using the GP atlas into the BoneXpert web page. Manually assessing BA using the GP atlas and applying it in the BoneXpert method for PAH calculation is routine practice in our clinics.
The estimated adult height for the BP and RWT methods was calculated using the online calculator (child metrics, www.childmetrics.org) (19). For the BP method, this calculator uses tables mentioned in the study by Post and Richman (20), which provide decimal fractions indicating the proportion of adult height attained at various BAs. These fractions are categorized based on whether the BA is average (within one year), delayed, or advanced relative to chronological age. PAH is determined by dividing the current height by the corresponding decimal fraction for the patient’s BA.
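The arithmetic of the BP step can be sketched as below. This is an illustration only: the decimal fraction must be read from the published BP tables for the patient's BA and BA category, and the value used in the example (0.86) is a hypothetical placeholder rather than a table entry. The function name is also an assumption.

```python
def bayley_pinneau_pah_cm(current_height_cm: float, fraction_attained: float) -> float:
    """Predicted adult height = current height / fraction of adult height attained.

    fraction_attained must come from the published tables; it is not
    computed here.
    """
    if not 0.0 < fraction_attained <= 1.0:
        raise ValueError("fraction_attained must be in the interval (0, 1]")
    return current_height_cm / fraction_attained


# Hypothetical example: a height of 150.0 cm and a placeholder fraction of 0.86.
print(f"PAH: {bayley_pinneau_pah_cm(150.0, 0.86):.1f} cm")
```

Because the fraction sits in the denominator, a smaller tabulated fraction yields a taller prediction for the same current height.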
The RWT method calculates PAH based on the child’s height (standing height was used instead of recumbent length due to the retrospective nature of data collection in this study), weight, and BA, incorporating mid-parental height and using sex- and age-specific coefficients (21). While recumbent length is generally recommended for younger children, standing height is more practical and is commonly measured in clinical practice for children older than two years. According to the World Health Organization Child Growth Standards, recumbent length is, on average, 0.7 cm greater than standing height (22). Although this small difference could theoretically affect PAH calculations, given the retrospective design of our study and the minimal variation between standing height and recumbent length, the impact on RWT prediction accuracy is expected to be negligible.
The BoneXpert method, available as a free online calculator at http://www.boneXpert.com/adult-height-predictor, is based on BA, chronological age, gender, height, father’s height, mother’s height, and ethnicity. For the ethnicity parameter, we selected the Caucasian European South population, as it most closely matches the Turkish population (23, 24). In Türkiye, the average height is reported as 163.1 cm for females and 176.2 cm for males, whereas in the Caucasian European South population, the corresponding averages are 162 cm for females and 175 cm for males (14, 25).
Ethics
This study was approved by the Local Ethics Committee of Dokuz Eylül University Faculty of Medicine (approval number: 2024/05-21, date: 07.02.2024) and performed in line with the principles of the Declaration of Helsinki. An informed written consent form was not obtained due to the retrospective nature of the study.
Statistical Analysis
Statistical analyses were conducted using Statistical Package for the Social Sciences for Windows, version 24.0 (IBM Co., Armonk, NY, USA). The normality of the data was assessed with the Kolmogorov-Smirnov and Shapiro-Wilk tests. Clinical data were reported as numbers (%) for categorical variables, mean±SD for continuous variables with a normal distribution, and medians with the respective 25th-75th percentile values for non-normally distributed variables. Comparisons between categorical variables were performed using the Pearson chi-square test or Fisher’s exact test, as appropriate. For continuous variables, the Student’s t-test was used to compare normally distributed data between the two groups, while the Mann-Whitney U test was applied for non-normally distributed data.
The intraclass correlation coefficient (ICC) was used to evaluate interobserver agreement, with ICC values interpreted as follows: excellent (greater than 0.9), good (0.75 to 0.9), moderate (0.5 to 0.75), and poor (less than 0.5). The Wilcoxon signed-rank test was employed to compare two related samples, with a p value of <0.05 considered statistically significant.
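The ICC interpretation bands listed above can be expressed as a small lookup. This is a minimal sketch; the text does not say which band the boundary values 0.9, 0.75, and 0.5 belong to, so assigning each boundary to the band that names it as its upper or lower limit in the order below is an assumption, as is the function name.

```python
def interpret_icc(icc: float) -> str:
    """Map an ICC value to the agreement category used in this study."""
    if icc > 0.9:
        return "excellent"
    if icc >= 0.75:  # assumption: 0.75 and 0.9 are both counted as "good"
        return "good"
    if icc >= 0.5:   # assumption: 0.5 is counted as "moderate"
        return "moderate"
    return "poor"


print(interpret_icc(0.8))
```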
Friedman’s two-way analysis of variance was employed to compare differences between predicted heights and FHs. If a significant difference was found, the Bonferroni post-hoc test was conducted for pairwise comparisons, with a p value of <0.0167 considered significant, calculated using the formula p=0.05 × 2/[k(k-1)], where k is the number of groups compared (k=3, giving three pairwise comparisons).
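The Bonferroni threshold can be reproduced with a short calculation. In this sketch, k is taken to be the number of methods compared, since three methods give k(k-1)/2 = 3 pairwise comparisons and the stated threshold of 0.0167; the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, k_groups: int) -> float:
    """Per-comparison significance threshold for all pairwise comparisons.

    Equivalent to alpha * 2 / (k * (k - 1)).
    """
    if k_groups < 2:
        raise ValueError("at least two groups are needed for a pairwise comparison")
    n_pairs = k_groups * (k_groups - 1) // 2
    return alpha / n_pairs


print(round(bonferroni_threshold(0.05, 3), 4))
```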
Results
Baseline Characteristics of Study Subjects
This study included sixty-two patients, with a mean age at diagnosis of 14.2±0.8 years. Fifty-six subjects (90.3%) were born with normal birth weights, while the remaining subjects were born large for gestational age. Thirty-six patients (58.1%) had a family history of CDGP. The main reasons for presenting to pediatric endocrinology clinics were short stature (n=56, 90.3%) and delayed puberty (n=5, 8.1%).
The baseline characteristics of the study subjects are presented in Table 1. The mean SD scores for height and BMI were -2.1±0.9 [(-3.5) - (0.3)] and -0.8±1.5 [(-4.3) - (2.6)], respectively. According to their age and sex, 36 subjects (58.1%) exhibited short stature, and 12 patients (19.4%) had a low BMI at the first evaluation.
Fifteen patients (24.2%) received low-dose testosterone therapy. Patients who underwent testosterone induction therapy were older, taller, and had a higher BMI SD score compared to those who did not (p=0.02, p=0.04, and p=0.01, respectively). However, the treated and untreated groups were similar in terms of height SD score and testicular volume at presentation (p=0.4 and p=0.5, respectively) (Table 1).
Bone Age Assessment
The median BAs evaluated using the GP atlas and the BoneXpert method were 12.5 (11.5-13.0) and 12.6 (11.8-13.4) years, respectively (p<0.001). For BAs determined using the GP atlas, the interobserver intraclass correlation coefficient was 0.964 (95% confidence interval 0.941-0.979). The median BA retardation was 2.0 (1.3-2.6) years when BA was assessed by the GP atlas and 1.6 (1.0-2.3) years when determined by the BoneXpert method (p<0.001). The median BAs of boys who were treated with testosterone and those who were not were similar (p=0.09 for the GP atlas and p=0.1 for the BoneXpert method).
Evaluation at 12 Months of Follow-up
First-year data were available for 36 (58.1%) patients. After a year, the mean age of these patients was 15.2±0.9 years. Of these 36 patients, 13 (36.1%) received low-dose testosterone treatment. The mean height SD score was -2.0±1.1, while the height velocity was 7.3±2.6 cm/year. The height velocity was not significantly different between boys who received testosterone treatment and those who did not (8.1±2.3 cm/year vs. 6.8±2.7 cm/year, p=0.6). The median testicular volume was 10 (8-12) mL for all the patients. The median testicular volume in the treated and untreated groups at 12 months of follow-up was also similar (p=0.4).
Characteristics of Boys at Final Height
The median age of the patients at their FH was 19.4 (18.5 to 20.3) years. They had mean target height and FH SD scores of -0.6±0.6 [(-2.1) - (1.0)] and -0.6±0.9 [(-2.5) - (1.4)], respectively (p=0.8). The difference in SD scores between the target height and the FH was -0.04±0.8 [(-2.2) - (2.0)]. Ninety-seven percent of the patients (n=60) reached an adult height within the target height range. Adult height was less than 165 cm in only four boys (6.5%). Furthermore, there was no significant difference in the FH SD score between boys who received testosterone and those who did not (p=0.1). No significant difference in target height was found between the two groups (p=0.5).
Comparison of Adult Height Prediction Methods
The median PAH SD scores calculated using the BP, RWT, and BoneXpert methods were -0.5 [(-1.3) - (-0.1)], -0.6 [(-1.0) - (0.0)], and -1.1 [(-1.6) - (-0.4)], respectively. Among the PAH methods, there was no significant difference between the FH and the PAH estimated by the BP and RWT methods (p=0.2 and p=0.6, respectively) (Table 2). Consequently, the BP and RWT methods provided more accurate predictions for boys with CDGP. Notably, the BoneXpert method underestimated the FH in these patients (p<0.001) (Figure 1). For the BP, RWT, and BoneXpert methods, the median differences between the SD scores of PAH and FH were 0.2 [(-0.4) - (0.7)], 0.03 [(-0.5) - (0.5)], and -0.4 [(-0.9) - (0.1)], respectively (Table 3). The median difference between PAH and FH for the BP and RWT methods was similar (p=0.2). However, the median differences between PAH and FH for the BP and RWT methods were both significantly higher than that for the BoneXpert method (p<0.001 and p<0.001, respectively).
In the subgroup analysis of boys with delayed BA ≤2 years or >2 years, the BP and RWT methods gave better predictions in patients with delayed BA ≤2 years (p=0.3 and p=0.4, respectively). In this subgroup, the BoneXpert method underestimated the FH. In boys with delayed BA >2 years, the RWT and BoneXpert methods were more accurate in predicting adult height (p=0.1 and p=0.1, respectively), whereas the BP method resulted in overestimation (Table 4). Consequently, the RWT method was the best predictor of FH among the three methods in boys with delayed BA of both ≤2 years and >2 years.
Discussion
In the literature, there is uncertainty about whether boys with CDGP can achieve their target height, regardless of whether they receive low-dose testosterone induction therapy or not. Moreover, studies evaluating adult height prediction methods for this population often show conflicting results, partly due to limited patient numbers. To address this issue, we collected data from a relatively large group of patients and compared the accuracy of the BP, RWT, and BoneXpert methods. Our findings indicated that the mean FH SD scores of boys with and without pubertal induction were similar, and the majority of patients were able to reach their target heights. Furthermore, this study demonstrated that the RWT method was more effective than other methods for estimating adult height in boys with delayed BA, regardless of whether the delay was ≤2 years or >2 years.
Pharmacological induction of puberty to accelerate the pubertal growth spurt and enhance statural outcomes in boys with CDGP remains contentious. While some researchers suggest that this therapy negatively impacts FH (25), other studies indicate no significant effect of testosterone treatment on FH or PAH in boys with CDGP (8, 26, 27, 28). For instance, Arrigo et al. (8) found no significant difference in FH between boys with CDGP who received low-dose testosterone therapy and those who did not, with similar height SD scores at diagnosis. Similarly, Kelly et al. (28) observed no significant difference in FH between the treated and untreated groups; however, they noted that the FHs of treated boys were closer to their genetic target heights compared to untreated boys, whose FHs were below their target heights. Consistent with these findings, our study also demonstrated no significant difference in FH SD scores between treated and untreated boys, suggesting that testosterone therapy neither improves nor impairs FH outcomes in boys with CDGP. Notably, even high-dose testosterone therapy has been reported to have no significant effect on the height-for-BA SD score, as shown by Büyükgebiz (29), further supporting the conclusion that testosterone therapy does not significantly influence FH outcomes in boys with CDGP.
Several studies have demonstrated that patients with CDGP may not reach their genetic target height (7, 30, 31). Poyrazoğlu et al. (7) reported that the FH of patients with CDGP was below their target height, with 46.3% of patients unable to attain their target height. Similarly, in a study of 15 boys, Rohani et al. (30) found that the FHs of the subjects were considerably less than their target heights (165.7±2.89 cm vs. 171.8±4.65 cm). In contrast, von Kalckreuth et al. (6) noted that their patients achieved their genetic target height without needing growth-stimulating therapy. In another retrospective study evaluating boys with CDGP, it was reported that the FHs of the patients were similar to their target heights. In addition, among those who did not undergo testosterone therapy, 3 out of 27 patients did not reach their target height, while only 1 out of 22 treated patients failed to reach their target height (8). In the present study and consistent with previous research, we found no significant difference between the FH and the target height. Most of our cohort reached an adult height within their expected target range. Only four boys had an adult height below 165 cm, reflecting familial height characteristics and influences.
There are published data comparing adult height prediction methods for boys with CDGP (5, 6, 7, 8, 9, 11, 28). For instance, a study involving 14 male patients with CDGP found that height predictions using the BP method (173.9±7.5 cm) were highly accurate when compared to the FH (171.3±5.3 cm) (6). In the study by Arrigo et al. (8), no significant differences were found between the final adult height and the PAH calculated using the BP method in both testosterone-induced and non-induced groups. However, they observed discrepancies greater than ±5 cm between FH and PAH in 33% of non-induced subjects and 23% of induced subjects. Kelly et al. (28) suggested that the Tanner and Whitehouse RUS (TW2) method is useful and accurate. They observed that the FHs of boys with CDGP were closely related to the estimated heights, with only three patients having FHs below the predicted range. According to Poyrazoğlu et al. (7), the BP method provided a very reliable estimation of adult height compared to the TW method.
Our findings indicated that the RWT method outperformed other methods in estimating adult height for boys with delayed BA, irrespective of whether the delay was ≤2 years or >2 years. Consistent with this, Brämswig et al. (9) concluded that the RWT method was the most accurate, while the BP method overestimated adult height in their cohort of 37 boys with untreated short stature and CDGP. Similarly, Reinehr et al. (11) reported that the BP method overestimated adult height, particularly in boys with a delayed BA of 2 years or more. To address this, they developed a new prediction model specifically for patients with CDGP, which they stated had a good predictive capability for subjects with retarded BA.
In contrast, Unrath et al. (5) found that the BoneXpert method, which incorporates parents’ heights, was more accurate in predicting FHs than the BP method in a cohort including boys with CDGP. Their study compared automated BA assessments using the BoneXpert software with manual BA assessments performed with the GP method. When the mean of BAs, blindly re-evaluated by three experienced pediatric endocrinologists, was considered the ‘Reference’ BA, it was found to be closer to the manual BA than the automated BA. The automated BA slightly overestimated BA, while the manual BA values were generally lower than the reference BA. Furthermore, using manual BA instead of automated BA in the BoneXpert adult height prediction calculator resulted in a slightly weaker, but still good, performance.
These studies have shown varying results for height prediction models in boys with CDGP. In the present study, the PAH estimated using both the BP and RWT methods was very closely aligned with, and was not significantly different from, the FH. In contrast, the BoneXpert method, whether using manual or automated BA assessments, underestimated the FH in these patients. This underestimation may be attributed to several factors. First, BoneXpert relies on generalized growth models that may not fully account for the dynamic and individualized growth patterns of boys with CDGP, particularly those with delayed bone maturation and pubertal onset (5). Second, as observed in both our study and the findings of Unrath et al. (5), automated methods like BoneXpert tend to slightly overestimate BA compared to manual assessments, leading to discrepancies in adult height predictions. Furthermore, our study found a significant difference between the median BAs obtained through manual and automated methods, with the automated BA being more advanced. This discrepancy highlights that automated systems like BoneXpert may have inherent margins of error, despite their standardization and efficiency, raising questions about their reliability. Future studies with larger sample sizes are needed to refine these models and enhance their predictive accuracy and clinical applicability.
BA retardation may result in inaccuracies in adult height predictions. Notably, in the studies cited above, the commonly used BP method tends to overestimate adult height in boys with CDGP (9, 11). In the present study, the BP and RWT methods were found to be more accurate for individuals with a BA delayed by two years or less. However, for boys with a BA delayed by more than two years, the BP method tended to overestimate, consistent with findings from previous studies. Conversely, the RWT and BoneXpert methods were found to be more reliable for these patients. This study demonstrated that the RWT method is the most accurate predictor of adult height, regardless of the magnitude of delay in BA in boys with CDGP. Its incorporation of multiple growth parameters, including height, weight, mid-parental height, and sex- and age-specific coefficients, likely accounts for its superior performance, particularly in patients with complex and variable growth patterns. In summary, these findings highlight the clinical utility of the RWT method as a reliable and precise tool for estimating adult height in boys with CDGP, even in the presence of delayed BA.
Study Limitations
The current study has several limitations. First, the inclusion criteria for study participants were somewhat extended to increase the sample size. In addition, the patients were recruited from different centers, resulting in a heterogeneous population; some patients received testosterone therapy, while others did not. Medical therapy was administered specifically to adolescents experiencing psychological challenges. Furthermore, there is inter- and intra-observer variability in the manual assessment of BA. Nonetheless, all radiographs were re-evaluated by two experienced pediatric endocrinologists, who demonstrated excellent agreement in BA determinations. Finally, the RWT method was originally designed for calculation using recumbent length, but due to the retrospective nature of the study, we used standing height measurements instead. Standing height is approximately 0.7 cm less than the recumbent length in children over two years old, which may affect RWT-based predictions.
Conclusion
In conclusion, the present study showed that low-dose testosterone induction therapy did not negatively impact FH, and both treated and untreated boys attained heights in line with their genetic target heights. Furthermore, the RWT method appears to be more suitable for accurate height estimation, especially in conditions such as CDGP, which is characterized by delayed BA. Future research should focus on developing disease-specific prediction models that offer superior advantages over traditional methods for predicting adult height in boys with CDGP.


