
Other data
Allergenicity Prediction

  • Accession
    KGD10764698
  • Submission date
    2026-06-17

Project Detail
Accession
KAP242380
Project title (English)
Animal-Derived Allergenic Protein Sequences
Project title (Korean)
-
Project description (English)
A comprehensive animal protein dataset was constructed for allergenicity prediction by integrating animal-derived allergen sequences and non-allergen protein sequences from publicly available allergen databases and UniProt. The allergen dataset includes proteins from animal-related sources such as mites, insects, milk, egg, fish, shellfish, meat, venom-producing organisms, and other animal taxa. Non-allergen proteins were collected from UniProt and assigned to the animal category based on organism-level taxonomy information. Where possible, non-allergen proteins were selected from the same or related animal-source organisms as those represented in the allergen dataset to reduce organism-level bias. This design helps the model focus on allergen-associated sequence features rather than simply distinguishing proteins by biological source. In total, 36,054 animal-derived protein sequences were compiled, including 1,480 allergen sequences and 34,574 non-allergen sequences, providing a curated resource for protein language model-based allergenicity prediction and explainable deep learning analysis.
Project description (Korean)
-
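
The allergen and non-allergen counts given in the project description imply a strong class imbalance. A minimal Python sketch of that arithmetic follows. The inverse-frequency weighting is a common heuristic added here for illustration only; the record does not say how, or whether, the authors weighted the classes, and `class_summary` is a hypothetical helper name.

```python
# Counts copied from the project description.
N_ALLERGEN = 1_480
N_NON_ALLERGEN = 34_574


def class_summary(n_pos: int, n_neg: int) -> dict:
    """Summarise class balance for a binary dataset (hypothetical helper)."""
    total = n_pos + n_neg
    return {
        "total": total,
        "positive_fraction": n_pos / total,
        "neg_to_pos_ratio": n_neg / n_pos,
        # Assumption: "balanced" inverse-frequency weights, total / (2 * count).
        "weight_pos": total / (2 * n_pos),
        "weight_neg": total / (2 * n_neg),
    }


summary = class_summary(N_ALLERGEN, N_NON_ALLERGEN)
for key, value in summary.items():
    print(f"{key}: {value:.4f}" if isinstance(value, float) else f"{key}: {value}")
```

The total reproduces the 36,054 sequences stated in the description, with allergens making up roughly 4% of the records.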

BioSample
  • Accession
    KAS24201783
  • Organism name
    Metazoa
  • Sample type
    Model organism or animal

Metadata
Category
Other (entered manually) - Protein Sequences
Human-derived data
NO
Title
Allergenicity Prediction
Keywords
Allergenicity prediction, Protein language models, Explainable deep learning
File description
Header: ID, Label, Domain, Database Source; Seq: Protein sequence
Release date
2026-06-09
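
The file description in the metadata above lists four header fields (ID, Label, Domain, Database Source) plus a protein sequence, but it does not state the file format or the field delimiter. The Python sketch below assumes a FASTA-style layout in which a `>` header line carries the four fields separated by `|`. Both points are assumptions, as are the names `Record` and `parse_records`. The sample records and their label values are invented placeholders, not entries from the dataset.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Record:
    id: str
    label: str
    domain: str
    database_source: str
    sequence: str


def _build(header: str, chunks: List[str], sep: str) -> Record:
    """Split a header into the four documented fields and attach the sequence."""
    fields = [field.strip() for field in header.split(sep)]
    if len(fields) != 4:
        raise ValueError(f"expected 4 header fields, got {len(fields)}: {header!r}")
    return Record(*fields, sequence="".join(chunks))


def parse_records(lines: Iterable[str], sep: str = "|") -> Iterator[Record]:
    """Yield records from FASTA-style text (assumed layout, see lead-in)."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _build(header, chunks, sep)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may wrap; they are joined when the record is built.
            chunks.append(line)
    if header is not None:
        yield _build(header, chunks, sep)


# Invented placeholder data, used only to exercise the parser.
SAMPLE = """>EX0001|1|Animal|ExampleDB
MKTAYIAKQR
QISFVKSHFS
>EX0002|0|Animal|UniProt
MSLLTEVETY
"""

records = list(parse_records(SAMPLE.splitlines()))
for record in records:
    print(record.id, record.label, record.database_source, len(record.sequence))
```

If the actual file uses a different delimiter, passing it as `sep` is enough; a tabular format such as CSV would need a different reader altogether.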

File