
Author: BRAIN GENE REGISTRY CONSORTIUM
Available documents written by this author (2)
Automated extraction of functional biomarkers of verbal and ambulatory ability from multi-institutional clinical notes using large language models / Levi KASTER in Journal of Neurodevelopmental Disorders, 17 (2025)
[article]
Title: Automated extraction of functional biomarkers of verbal and ambulatory ability from multi-institutional clinical notes using large language models
Document type: printed text
Authors: Levi KASTER; Ethan HILLIS; Inez Y. OH; Bhooma R. ARAVAMUTHAN; Virginia C. LANZOTTI; Casey R. VICKSTROM; BRAIN GENE REGISTRY CONSORTIUM; Christina A. GURNETT; Philip R.O. PAYNE; Aditi GUPTA
Language: English (eng)
Keywords: Adolescent; Adult; Child; Child, Preschool; Female; Humans; Male; Young Adult; Biomarkers; Cerebral Palsy/physiopathology/diagnosis; Developmental Disabilities/physiopathology/diagnosis; Electronic Health Records; Intellectual Disability/physiopathology/diagnosis; Large Language Models; Natural Language Processing; Registries; Electronic health records; Functional biomarkers; Large language models; Neurodevelopmental disorders
Declarations: […] recruited specifically for this study. This work constitutes secondary use of data approved by the Washington University in St. Louis IRB (protocols #202010013 [Brain Gene Registry cohort] and #202309003 [cerebral palsy cohort]). Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.
Decimal index: PER Periodicals
Abstract:
BACKGROUND: Functional biomarkers in neurodevelopmental disorders, such as verbal and ambulatory abilities, are essential for clinical care and research activities. Treatment planning, intervention monitoring, and identifying comorbid conditions in individuals with intellectual and developmental disabilities (IDDs) rely on standardized assessments of these abilities. However, traditional assessments impose a burden on patients and providers, often leading to longitudinal inconsistencies and inequities due to evolving guidelines and associated time-cost. Therefore, this study aimed to develop an automated approach to classify verbal and ambulatory abilities from EHR data of IDD and cerebral palsy (CP) patients. Application of large language models (LLMs) to clinical notes, which are rich in longitudinal data, may provide a low-burden pipeline for extracting functional biomarkers efficiently and accurately.
METHODS: Data from the multi-institutional National Brain Gene Registry (BGR) and a CP clinic cohort were utilized, comprising 3,245 notes from 125 individuals and 5,462 clinical notes from 260 individuals, respectively. Employing three LLMs (GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4 Omni), we provided the models with a clinical note and utilized a detailed conversational format to prompt the models to answer: "Does the individual use any words?" and "Can the individual walk without aid?" These responses were evaluated against ground-truth abilities, which were established using neurobehavioral assessments collected for each dataset.
RESULTS: LLM pipelines demonstrated high accuracy (weighted-F1 scores > .90) in predicting ambulatory ability for both cohorts, likely due to the consistent use of Gross Motor Functional Classification System (GMFCS) as a consistent ground-truth standard. However, verbal ability predictions were more accurate in the BGR cohort, likely due to higher adherence between the prompt and ground-truth assessment questions. While LLMs can be computationally expensive, analysis of our protocol affirmed the cost effectiveness when applied to select notes from the EHR.
CONCLUSIONS: LLMs are effective at extracting functional biomarkers from EHR data and broadly generalizable across variable note-taking practices and institutions. Individual verbal and ambulatory ability were accurately extracted, supporting the method's ability to streamline workflows by offering automated, efficient data extraction for patient care and research. Future studies are needed to extend this methodology to additional populations and to demonstrate more granular functional data classification.
Online: https://dx.doi.org/10.1186/s11689-025-09612-w
Permalink: https://www.cra-rhone-alpes.org/cid/opac_css/index.php?lvl=notice_display&id=576
The Brain Gene Registry: a data snapshot / Dustin BALDRIDGE in Journal of Neurodevelopmental Disorders, 16 (2024)
[article]
Title: The Brain Gene Registry: a data snapshot
Document type: printed text
Authors: Dustin BALDRIDGE; Levi KASTER; Catherine SANCIMINO; Siddharth SRIVASTAVA; Sophie MOLHOLM; Aditi GUPTA; Inez OH; Virginia LANZOTTI; Daleep GREWAL; Erin Rooney RIGGS; Juliann M. SAVATT; Rachel HAUCK; Abigail SVEDEN; BRAIN GENE REGISTRY CONSORTIUM; John N. CONSTANTINO; Joseph PIVEN; Christina A. GURNETT; Maya CHOPRA; Heather HAZLETT; Philip R.O. PAYNE
Language: English (eng)
Keywords: Humans; Male; Female; Autism Spectrum Disorder/genetics; Autistic Disorder; Neurodevelopmental Disorders; Intellectual Disability; Brain; Registries; Methyltransferases; Brain gene registry; Electronic health records; Neurodevelopmental disorders
Decimal index: PER Periodicals
Abstract: Monogenic disorders account for a large proportion of population-attributable risk for neurodevelopmental disabilities. However, the data necessary to infer a causal relationship between a given genetic variant and a particular neurodevelopmental disorder is often lacking. Recognizing this scientific roadblock, 13 Intellectual and Developmental Disabilities Research Centers (IDDRCs) formed a consortium to create the Brain Gene Registry (BGR), a repository pairing clinical genetic data with phenotypic data from participants with variants in putative brain genes. Phenotypic profiles are assembled from the electronic health record (EHR) and a battery of remotely administered standardized assessments collectively referred to as the Rapid Neurobehavioral Assessment Protocol (RNAP), which include cognitive, neurologic, and neuropsychiatric assessments, as well as assessments for attention deficit hyperactivity disorder (ADHD) and autism spectrum disorder (ASD). Co-enrollment of BGR participants in the Clinical Genome Resource's (ClinGen's) GenomeConnect enables display of variant information in ClinVar. The BGR currently contains data on 479 participants who are 55% male, 6% Asian, 6% Black or African American, 76% white, and 12% Hispanic/Latine. Over 200 genes are represented in the BGR, with 12 or more participants harboring variants in each of these genes: CACNA1A, DNMT3A, SLC6A1, SETD5, and MYT1L. More than 30% of variants are de novo and 43% are classified as variants of uncertain significance (VUSs). Mean standard scores on cognitive or developmental screens are below average for the BGR cohort. EHR data reveal developmental delay as the earliest and most common diagnosis in this sample, followed by speech and language disorders, ASD, and ADHD. BGR data has already been used to accelerate gene-disease validity curation of 36 genes evaluated by ClinGen's BGR Intellectual Disability (ID)-Autism (ASD) Gene Curation Expert Panel. In summary, the BGR is a resource for use by stakeholders interested in advancing translational research for brain genes and continues to recruit participants with clinically reported variants to establish a rich and well-characterized national resource to promote research on neurodevelopmental disorders.
Online: https://dx.doi.org/10.1186/s11689-024-09530-3
Permalink: https://www.cra-rhone-alpes.org/cid/opac_css/index.php?lvl=notice_display&id=575

