Other data
Allergenicity Prediction
- Accession: KGD10764699
- Submission date: 2026-06-17
Project Detail
| Accession | KAP242381 |
|---|---|
| Project title (English) | Plant-Derived Allergenic Protein Sequences |
| Project title (Korean) | - |
| Project description (English) | A comprehensive plant protein dataset was constructed for allergenicity prediction by integrating plant-derived allergen sequences and non-allergen protein sequences from publicly available allergen databases and UniProt. The allergen dataset includes proteins from plant-related sources such as pollen, seeds, nuts, fruits, vegetables, cereals, legumes, and other plant taxa. Non-allergen proteins were collected from UniProt and assigned to the plant category based on organism-level taxonomy information. Where possible, non-allergen proteins were selected from the same or related plant-source organisms as those represented in the allergen dataset to reduce organism-level bias. This design helps a model trained on the data focus on allergen-associated sequence features rather than simply distinguishing proteins by biological source. In total, 13,186 plant-derived protein sequences were compiled, comprising 1,927 allergen sequences and 11,259 non-allergen sequences, providing a curated resource for protein language model-based allergenicity prediction and explainable deep learning analysis. |
| Project description (Korean) | - |
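The counts in the project description imply a marked class imbalance: allergens make up roughly 14.6% of the sequences, about one allergen for every 5.8 non-allergens. The sketch below checks that arithmetic and derives inverse-frequency class weights. The weighting scheme and the names `balanced_class_weights`, `counts`, and `weights` are illustrative assumptions, not something the record specifies.

```python
# Counts as reported in the project description.
N_ALLERGEN = 1_927
N_NON_ALLERGEN = 11_259
N_TOTAL = N_ALLERGEN + N_NON_ALLERGEN  # should match the stated total of 13,186


def balanced_class_weights(counts):
    """Return inverse-frequency weights: n_total / (n_classes * n_class).

    This is a common heuristic for imbalanced classification; whether the
    dataset authors used it (or any weighting) is not stated in the record.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


counts = {"allergen": N_ALLERGEN, "non_allergen": N_NON_ALLERGEN}
weights = balanced_class_weights(counts)

allergen_fraction = N_ALLERGEN / N_TOTAL
non_to_allergen_ratio = N_NON_ALLERGEN / N_ALLERGEN

print(f"total sequences:       {N_TOTAL}")
print(f"allergen fraction:     {allergen_fraction:.3f}")
print(f"non-allergen:allergen: {non_to_allergen_ratio:.2f}:1")
print(f"class weights:         {weights}")
```

With these weights each class contributes the same total weight to a loss function, which is one way to keep the minority allergen class from being swamped during training.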
BioSample
- Accession: KAS24201784
- Organism name: Viridiplantae
- Sample type: Plant or fungi
Metadata
| Category | Other (user-specified) - Protein Sequences |
|---|---|
| Human-derived data | NO |
| Title | Allergenicity Prediction |
| Keywords | Allergenicity prediction, Protein language models, Explainable deep learning |
| File description | Header: ID, Label, Domain, Database Source; Seq: Protein sequence |
| Release date | 2026-06-09 |
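The file description above lists four header fields (ID, Label, Domain, Database Source) followed by a protein sequence. The record does not state the file format or the field delimiter, so the following is only a minimal sketch. It assumes a FASTA-like layout with a `>` header line and `|`-separated fields. The `Record` class, the `parse_records` function, the `sep` parameter, and the sample entries are hypothetical placeholders and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Record:
    id: str
    label: str
    domain: str
    database_source: str
    sequence: str


def _build(header: str, chunks: List[str], sep: str) -> Record:
    # Split the header into the four fields named in the file description.
    fields = [f.strip() for f in header.split(sep)]
    if len(fields) != 4:
        raise ValueError(f"expected 4 header fields, got {len(fields)}: {header!r}")
    return Record(*fields, sequence="".join(chunks))


def parse_records(lines: Iterable[str], sep: str = "|") -> Iterator[Record]:
    """Parse FASTA-like text; the '>' marker and `sep` are assumptions."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _build(header, chunks, sep)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may wrap; they are concatenated per record.
            chunks.append(line)
    if header is not None:
        yield _build(header, chunks, sep)


# Made-up example entries, for illustration only.
sample = """\
>EX0001|allergen|plant|ExampleDB
MKTAYIAKQR
QISFVKSH
>EX0002|non-allergen|plant|UniProt
MSDNELLK
""".splitlines()

records = list(parse_records(sample))
for rec in records:
    print(rec.id, rec.label, rec.database_source, len(rec.sequence))
```

If the actual file is tabular (for example CSV with a `Seq` column) rather than FASTA-like, the same `Record` fields would still apply, but the parsing step would need to be replaced.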
File