This article describes the study selection process used to identify and analyze the most common diagnoses encountered in primary care settings. The methodology relies on predefined inclusion and exclusion criteria, independent review, and a risk of bias assessment to capture relevant studies while maintaining data integrity and minimizing bias.
Inclusion and Exclusion Criteria
The selection process began with title and abstract screening, followed by a full-text review of articles conducted by three independent reviewers. Studies were included based on the following stringent criteria:
- Setting: Studies must be based in general practice or primary care environments.
- Reasons for Visit (RFVs): A minimum of 10 distinct RFVs had to be reported within each study to ensure a comprehensive overview of common primary care presentations.
- Study Population Size: Studies were required to include at least 20,000 patient visits, or to involve at least 5 clinicians over a period of one year or more. Alternatively, studies covering 7,500 patients over a year or more also met the population criterion. These thresholds represent a substantial primary care practice volume: 5 clinicians each seeing 20 patients daily over 200 working days yields 20,000 visits per year, and 5 physicians with 1,500-patient panels yields 7,500 patients.
- Observational Design: Only observational studies were considered for inclusion, focusing on real-world primary care encounters.
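The population-size thresholds follow from simple arithmetic, sketched below in Python. The function name `meets_population_size` and its parameters are hypothetical. Reading the criterion as three alternative routes (visit count alone, clinician count with at least one year of data, or patient count with at least one year of data) is an assumption about how the conditions combine, not something the text states explicitly.

```python
# Threshold arithmetic from the practice model described in the criteria.
CLINICIANS = 5
PATIENTS_PER_DAY = 20
WORKING_DAYS_PER_YEAR = 200
PANEL_SIZE = 1500  # patients per physician

MIN_VISITS = CLINICIANS * PATIENTS_PER_DAY * WORKING_DAYS_PER_YEAR  # 20,000
MIN_PATIENTS = CLINICIANS * PANEL_SIZE  # 7,500


def meets_population_size(visits=None, clinicians=None, patients=None, years=0.0):
    """Return True if a study satisfies any one of the three assumed routes."""
    # Route 1 (assumed): enough visits, regardless of duration.
    if visits is not None and visits >= MIN_VISITS:
        return True
    long_enough = years >= 1
    # Route 2 (assumed): enough clinicians observed for a year or more.
    if long_enough and clinicians is not None and clinicians >= CLINICIANS:
        return True
    # Route 3 (assumed): enough patients observed for a year or more.
    if long_enough and patients is not None and patients >= MIN_PATIENTS:
        return True
    return False
```

For example, a hypothetical study with 6 clinicians but only six months of data would not qualify through the clinician route under this reading.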
Conversely, studies were excluded if they exhibited any of the following characteristics:
- Specific Visit Types: Studies focusing solely on specific visit types, such as periodic health examinations, were excluded to maintain a broad focus on general primary care presentations.
- Specific Conditions: Research limited to specific conditions or problems, such as acute conditions only, was not included to ensure a comprehensive view of common diagnoses.
- Selected Populations: Studies focusing on narrow populations, like adolescents exclusively, were excluded to generalize findings across broader age ranges in primary care.
- Referral-Based Visits: Studies where visits originated from referrals (e.g., to pediatrics or internal medicine specialists) were excluded to concentrate on primary care’s initial diagnostic landscape.
- Publication Date: Studies published before 1996 were excluded to ensure the data reflected contemporary primary care practices.
In cases of multiple publications from the same data source, preference was given to the most recent and complete datasets with the most detailed information. Duplicate publications were only included if they presented distinct analyses of the data, such as subgroup analyses. Disagreements during the review process were resolved through consensus or third-party adjudication. Authors were contacted to obtain additional or unpublished data when necessary. Non-English articles were translated using Google Translate to broaden the scope of included research.
Data Extraction and Synthesis
Data extraction was performed independently by two reviewers. The primary outcome of interest was the reported Reason for Visit (RFV), defined as the patient’s presenting complaint or the problem managed by primary care physicians. For each of the top RFVs (up to 20 per study), the number, percentage, or rate of associated visits was recorded. Descriptive study characteristics were also collected, including whether RFVs were patient- or clinician-reported, total visits, clinician or practice numbers, data collection location and duration, patient demographics (gender and age distribution), and the coding system used (e.g., International Classification of Primary Care, ICD-9, ICD-10).
Risk of Bias Assessment
To evaluate the robustness of each study, a risk of bias assessment was conducted using five key characteristics, scored from 0 (high risk) to 1 (low risk). These characteristics included:
- Representative Clinician Sample: Studies scored 1 if they met at least two of three criteria: both male and female clinicians, no limitations on years in practice, and no restrictions on practice size.
- Representative Patient Sample: Studies scored 1 if they met at least two of three criteria: both male and female patients, a mix of urban and rural settings, and no age group limitations.
- Data Collection Method: Prospective data collection scored 1 (low risk), while retrospective scored 0 (high risk).
- Coding System Specification: Studies clearly specifying their coding system scored 1, while those not specifying scored 0.
- Data Collection Duration: A data collection period of ≥ 1 year scored 1, indicating a more robust and representative dataset; shorter periods scored 0.
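The five characteristics above can be expressed as a small scoring function. This is a minimal sketch: the `StudyProfile` fields and the `risk_of_bias` function are hypothetical names introduced for illustration, and scoring a collection period shorter than one year as 0 is an assumption implied by the binary scale.

```python
from dataclasses import dataclass


@dataclass
class StudyProfile:
    # Clinician sample criteria
    both_clinician_sexes: bool
    any_years_in_practice: bool
    any_practice_size: bool
    # Patient sample criteria
    both_patient_sexes: bool
    urban_and_rural: bool
    all_age_groups: bool
    # Remaining characteristics
    prospective: bool
    coding_system_specified: bool
    collection_years: float


def risk_of_bias(study):
    """Score each characteristic as 1 (low risk) or 0 (high risk)."""
    clinician_met = sum([study.both_clinician_sexes,
                         study.any_years_in_practice,
                         study.any_practice_size])
    patient_met = sum([study.both_patient_sexes,
                       study.urban_and_rural,
                       study.all_age_groups])
    return {
        "clinician_sample": int(clinician_met >= 2),
        "patient_sample": int(patient_met >= 2),
        "data_collection": int(study.prospective),
        "coding_system": int(study.coding_system_specified),
        "duration": int(study.collection_years >= 1),  # assumed: < 1 year scores 0
    }
```

Keeping the five item scores separate, rather than collapsing them into a single total, mirrors how the characteristics are listed; whether the scores were summed per study is not stated here.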
Data Categorization and Analysis
Reported RFVs were categorized into “general” (broad groupings like “respiratory”) and “specific” categories (precise diagnoses like “pneumonia”). A standardized coding scheme was applied within each category. For instance, in the specific RFV category, terms like “back complaint,” “dorsopathies,” and “neck pain” were consistently coded under “back pain/spinal pain.” Detailed diagnostic coding legends are available at CFPlus.
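As an illustration of the standardized coding step, the sketch below maps reported terms to a common label. Only the three back pain terms come from the example above; the dictionary name, the function name, and the choice to pass unmapped terms through unchanged (after normalizing case and spacing) are assumptions made for this example, not the published coding legend.

```python
# Partial mapping built only from the example terms given in the text.
SPECIFIC_RFV_CODES = {
    "back complaint": "back pain/spinal pain",
    "dorsopathies": "back pain/spinal pain",
    "neck pain": "back pain/spinal pain",
}


def standardize_rfv(term, codes=SPECIFIC_RFV_CODES):
    """Map a reported RFV term to its standardized label."""
    # Normalize case and collapse repeated whitespace before lookup.
    key = " ".join(term.lower().split())
    # Assumption: terms without a mapping are kept in normalized form.
    return codes.get(key, key)
```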
To analyze the most common visits, RFVs from each study were ranked by frequency. Due to inconsistent reporting methods (number, percent, or rate of visits), the rank of each RFV was used as the measure of relative frequency. For each study’s top 20 RFVs, ranks were assigned from 20 (most common) to 1. RFVs outside the top 20 received a rank of zero. These ranks were then combined, and mean ranks were calculated for each RFV. The RFVs with the highest mean ranks were identified as the most commonly encountered. RFVs present in only one study were excluded from the combined analysis to ensure generalizability.
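The rank-based aggregation can be sketched as follows. The function name `mean_ranks` and the input shape (each study as a list of already standardized RFVs ordered from most to least common) are assumptions for illustration. The sketch also assumes each RFV appears at most once per study and treats "present in a study" as appearing in that study's top 20.

```python
from collections import defaultdict

TOP_N = 20


def mean_ranks(studies):
    """Compute mean rank per RFV across studies.

    studies: dict mapping a study label to a list of RFVs ordered
    from most common to least common.
    """
    totals = defaultdict(int)
    appearances = defaultdict(int)
    for ordered_rfvs in studies.values():
        # The most common RFV gets rank 20, the next 19, and so on down to 1.
        for position, rfv in enumerate(ordered_rfvs[:TOP_N]):
            totals[rfv] += TOP_N - position
            appearances[rfv] += 1
    n_studies = len(studies)
    # Studies where an RFV is outside the top 20 contribute a rank of zero,
    # so the total is divided by the number of all studies.
    # RFVs seen in only one study are dropped from the combined analysis.
    return {rfv: totals[rfv] / n_studies
            for rfv in totals if appearances[rfv] >= 2}


example = {
    "study_a": ["cough", "back pain/spinal pain", "hypertension"],
    "study_b": ["hypertension", "cough"],
    "study_c": ["cough", "rash"],
}
combined = mean_ranks(example)
```

In this made-up example, "cough" and "hypertension" are retained because each appears in at least two studies, while "back pain/spinal pain" and "rash" are excluded because each appears in only one.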
Secondary analyses included comparing clinician-reported RFV mean ranks across countries categorized by economic development status (developed vs. developing, using United Nations classifications). Subgroup analyses from included studies (e.g., by clinician or patient sex, practice setting) were also combined using the same ranking approach, provided that subgroups were represented in at least two studies.
Together, these selection, extraction, and rank-based analysis steps provide a consistent basis for identifying the most common diagnoses in primary care, information that supports resource allocation, medical education, and public health planning.