Aneesh Mazumder

with Large Language Model-Driven Knowledge Discovery

Authors:

Aneesh Mazumder, Haoqi Sun, Martin Zhang

Date Created:

2025-01-01

Course Title:
Professor:

Not specified

About Paper:

Although Alzheimer’s disease (AD) is defined by classical pathologies, novel molecular pathways like histone deacetylase 6 (HDAC6) are increasingly linked to its progression. The expanding literature on these pathways necessitates systematic methods for analyzing complex molecular relationships. Large language models (LLMs) provide a powerful means of extracting and organizing knowledge by automatically interpreting scientific texts. These insights can be structured into knowledge graphs (KGs) that map molecular interactions in AD.

To build a KG centering on HDAC6’s role in AD, we extracted a corpus comprising 125 full-text PubMed papers. Preprocessing involved abbreviation expansion and coreference resolution. Three large language models, including GPT-4.1 mini, Gemini 2.5 Flash, and DeepSeek Chat, were used to process each sentence individually and extract subject-predicate-object triplets, constrained so that both subject and object were molecular entities, such as proteins or small molecules. Model performance was evaluated using human-assessed accuracy on triplets extracted from 10 randomly sampled full-text papers.

GPT-4.1 mini significantly outperformed both Gemini 2.5 Flash and DeepSeek Chat, with accuracy scores of 54.73% and 55.75%, respectively, as reported by Reviewer 1 and Reviewer 2. DeepSeek Chat achieved 26.72% and 26.47%, while Gemini 2.5 Flash scored 11.52% and 11.70%. Cohen’s Kappa values were consistently high across all models (0.969 for GPT-4.1 mini, 0.947 for DeepSeek Chat, and 0.975 for Gemini), confirming strong inter-rater reliability.

GPT-4.1 mini demonstrated the highest accuracy among the tested models in extracting HDAC6-related molecular triplets from the AD literature, enabling the efficient construction of a structured knowledge graph. This graph will provide a foundational framework for mapping known interactions and identifying gaps that may indicate novel therapeutic targets. To extend this framework, graph neural networks will be applied to predict new HDAC6-related interactions, supporting the discovery of previously unknown molecular mechanisms in AD.

Clustering Analysis of DESI Luminous Red Galaxies using Hyper Suprime-Cam Photometry
Gage Miller, Daniel Eisenstein
Harvard College | Adams House | Astrophysics | 2027

Galaxy clustering presents a valuable tool for probing the large-scale structure of the universe. This study aims to measure differences in galaxy clustering among luminous red galaxies based on the densities of their large-scale environments. This was done by utilizing the Hyper Suprime-Cam Subaru Strategic Program’s Public Data Release 3 photometric catalogs to count “faint red neighbors” of luminous red galaxies in the Dark Energy Spectroscopic Instrument survey’s DR1 Iron data release. In total, this study includes 600 square degrees on the sky. This area contains upwards of 1,800,000 LRGs and 120,000,000 HSC objects. Cuts were made to the set of HSC objects based on color and luminosity. DESI LRGs were split into a sample considered to be in dense environments and a sample in sparse environments relative to the median number of faint red neighbors. The two-point correlation function was then measured for each subset and compared to that of the other sample. We expect LRGs in densely populated regions to show higher levels of clustering when compared to those in sparsely populated regions. This study shows the potential of combining wide spectroscopy from surveys like DESI with deep imaging surveys such as Hyper Suprime-Cam or the highly anticipated Vera C. Rubin Observatory for probing properties of the large-scale structure of the universe.

Program for Research in Science and Engineering

Associations Between Maternal Childhood Maltreatment, Infant Epigenetic
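The HDAC6 abstract reports per-reviewer accuracy and Cohen's Kappa for the extracted triplets but does not show how they are computed. The following is a minimal sketch of both statistics, assuming each reviewer marks every triplet as correct (1) or incorrect (0). The function names and the ten example labels are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter


def accuracy(labels):
    """Fraction of triplets a reviewer marked correct (1 = correct, 0 = incorrect)."""
    return sum(labels) / len(labels)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments for ten triplets (illustrative only).
reviewer_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reviewer_2 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]

acc_1 = accuracy(reviewer_1)
acc_2 = accuracy(reviewer_2)
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"accuracy R1={acc_1:.2f}, R2={acc_2:.2f}, kappa={kappa:.2f}")
```

In this toy case the reviewers disagree on one triplet out of ten, so observed agreement is 0.9, chance agreement is 0.5, and kappa is 0.8. Kappa values near 1, like those in the abstract, indicate that the two reviewers' judgments almost always coincide beyond what chance would produce.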

Source:

Harvard / Amelie Martin, Cade Kane, Noel Michele Holbrook / 2025

Topics:

large, model, molecular, red, galaxy, knowledge, hdac6, gpt, mini, graph, language, gemini

Professor Score
92.5
Verified