Background: Primary care datasets offer valuable longitudinal data for clinical research and health policy. However, Ireland’s primary care data infrastructure remains limited, and there are concerns about inconsistent diagnostic coding. Previous studies have highlighted gaps in coding practices for chronic conditions, but similar validation for cancer diagnoses is lacking. This study examines the utility of Irish GP data for cancer research.
Aims:
To estimate cancer incidence in a high-risk cohort aged over 60 using primary care diagnostic codes and compare these rates with age- and sex-adjusted incidence rates from the National Cancer Registry Ireland (NCRI).
To assess inter-practice variability in coding and identify factors influencing coding accuracy.
Methods: We conducted a retrospective cohort study using anonymised data from 43 GP practices in the Irish Primary Care Research Network (IPCRN), following RECORD guidelines. Data spanning 1 January 2011 to 5 April 2018 were extracted using a standardised tool. Cancer cases were identified using ICD-10 and ICPC-2 codes. Incidence rates per 100,000 person-years were calculated and compared with NCRI data. Inter-practice variability in coding was assessed. The analysis will be extended to the CRADLE dataset, which includes EHR data from 75 practices representing 600,000 patients.
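The incidence calculation named in the Methods can be illustrated with a minimal sketch. The function name, the log-scale Poisson approximation for the 95% confidence interval, and the example figures are assumptions made for illustration only; the abstract does not state which interval method or software the study used.

```python
import math


def incidence_rate_per_100k(cases: int, person_years: float, z: float = 1.96):
    """Crude incidence rate per 100,000 person-years with an approximate CI.

    Assumption: case counts are treated as Poisson, and the interval is the
    common log-scale approximation rate * exp(+/- z / sqrt(cases)).
    """
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    if cases <= 0:
        raise ValueError("the log-scale interval needs at least one case")

    # Multiply before dividing to keep the arithmetic exact for round inputs.
    rate = cases * 100_000 / person_years
    factor = math.exp(z / math.sqrt(cases))
    return rate, rate / factor, rate * factor


# Hypothetical figures, not taken from the study: 120 new cases
# observed over 25,000 person-years of follow-up.
rate, lower, upper = incidence_rate_per_100k(120, 25_000)
print(f"{rate:.1f} per 100,000 person-years (95% CI {lower:.1f} to {upper:.1f})")
```

A rate computed this way is crude. Comparison with registry figures would additionally require age and sex adjustment, which this sketch does not attempt.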
Initial Results: The initial cohort included 51,160 patients with a mean follow-up of 5.3 years, among whom 3,432 new cancer cases were identified. Prostate cancer, leukaemia, and cervical cancer were the most accurately coded. For 16 cancers, observed incidence differed significantly from NCRI estimates (p < 0.05). Substantial inter-practice variability was evident, with coding rates ranging from 0.03 to 54.2 codes per patient. ICPC-2 was the preferred coding system, but ICD-10 was used more consistently for certain cancers.
Implications: These findings highlight discrepancies between cancer incidence in primary care data and national registry rates, underscoring the need for improved coding practices. Addressing these issues could enhance the utility of primary care datasets for cancer research and surveillance.