Single center Automated, Multi-Source deeply Phenotyped Heart Transplant Registry as a template to build tailored data infrastructure
Patel, K.; Eager, T. N.; Ghobrial, M.; Moore, L. W.; Guha, A.; Martin, C.; Akay, M. H.; Loza, L.; Jones, S. L.; Gaber, A. O.; Bhimaraj, A.
Background: Traditional heart transplant registries often lack the granularity required for deep phenotyping and rely on labor-intensive manual abstraction. We describe the methodology and validation of a next-generation, automated, multi-source registry designed to address these limitations.
Methods: Using a high-performance computing environment, we integrated structured data from Epic data warehouses (Clarity and Caboodle), external molecular diagnostics, and verified UNOS survival records. We developed a custom deterministic, rule-based natural language processing (NLP) engine to extract echocardiographic measures, rejection grades, and vasculopathy scores from over 21,000 unstructured clinical reports.
Results: The Houston Methodist J.C. Walter Jr. Transplant Center Precision Registry and Platform-Heart (TCPR-Heart) captures 1,687 heart transplants (1,636 patients) performed from 1984 to 2025. Of these, 1,054 transplants have active clinical follow-up: 555 were extracted and abstracted from our modern electronic health record (EHR) in the decade since its deployment, providing data across each patient's entire transplant course; 427 were legacy active transplants (performed before 2016 with continued follow-up); and 72 were external transplants (performed elsewhere but followed at Methodist). The registry also houses a historic cohort of 633 transplants (last follow-up before June 2016) with limited variables. Automated deep phenotyping generated longitudinal data trends across clinical domains, including immunosuppression strategies, rejection, immunologic HLA data, renal function, metabolic profiles, vasculopathy, graft function, hospitalization burden, and survival.
Conclusion: This automated framework unifies clinical, administrative, and molecular data streams. By maintaining an automated, regularly updated registry, we established a scalable, high-fidelity, expertly curated and validated data source that can serve as a foundation for further innovations and novel applications.
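To make the idea of deterministic, rule-based extraction from report text concrete, the following is a minimal Python sketch. It is not the registry's actual engine: the regular expressions, the function name `extract_measures`, and the output field names are all hypothetical, chosen only for illustration. It assumes reports state ejection fraction as a percentage (or a range), cellular rejection as an ISHLT grade (0R to 3R), and vasculopathy as an ISHLT CAV grade (0 to 3).

```python
import re
from typing import Dict, Optional, Union

# Hypothetical patterns for illustration; a production rule set would need
# many more variants, negation handling, and validation against chart review.
LVEF_RE = re.compile(
    r"\b(?:LVEF|EF|ejection fraction)\b\D{0,30}?"
    r"(\d{1,3}(?:\.\d+)?)(?:\s*(?:-|to)\s*(\d{1,3}(?:\.\d+)?))?\s*%",
    re.IGNORECASE,
)
ACR_RE = re.compile(r"\bgrade\s*:?\s*([0-3]R)\b", re.IGNORECASE)
CAV_RE = re.compile(r"\bCAV\s*-?\s*([0-3])\b", re.IGNORECASE)

Value = Optional[Union[float, str, int]]


def extract_measures(report: str) -> Dict[str, Value]:
    """Apply fixed rules to one report; fields with no match are None."""
    result: Dict[str, Value] = {
        "lvef_percent": None,
        "acr_grade": None,
        "cav_grade": None,
    }

    m = LVEF_RE.search(report)
    if m:
        low = float(m.group(1))
        # A range such as "55-60%" is reduced to its midpoint (an assumption).
        high = float(m.group(2)) if m.group(2) else low
        result["lvef_percent"] = (low + high) / 2

    m = ACR_RE.search(report)
    if m:
        result["acr_grade"] = m.group(1).upper()

    m = CAV_RE.search(report)
    if m:
        result["cav_grade"] = int(m.group(1))

    return result
```

Because the rules are fixed patterns rather than a statistical model, the same report always yields the same output, which is what makes this style of extraction straightforward to audit against manual abstraction.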