Perform a comprehensive structural biology analysis for: $ARGUMENTS

Use the MCP tools to gather structural data from all relevant sources, then synthesize a detailed structural report. Follow the steps below in order. If a step fails or returns no data, note the gap and continue.

Input Parsing

Determine the input type:

•PDB ID (4 characters, e.g., 1TUP, 6W9C) — start with PDB lookup, then identify the protein
•UniProt accession (e.g., P04637) — resolve gene symbol, then search PDB
•Gene symbol (e.g., TP53) — resolve to UniProt, then search PDB

Data Gathering Steps

1. Protein Identity

•If starting from a gene symbol, call uniprot_search with the symbol and organism_id:9606 (reviewed: true) to get the canonical UniProt accession.
•Call uniprot_get_protein with the accession to get protein length, function, and gene name.
•If starting from a PDB ID, call pdb_get_structure first, extract the UniProt mapping from entity details, then call uniprot_get_protein.

2. Domain Architecture

•Call interpro_get_domains with the UniProt accession.
•Extract all domains with their positions — these will be mapped onto structures later.
•Call uniprot_get_features with the UniProt accession to get detailed feature annotations (active sites, binding sites, disulfide bonds, metal binding).

3. All Experimental Structures (PDB)

•Call pdb_search with the gene symbol or protein name (limit: 20) to find all available structures.
•
For the top 5 structures by resolution, call pdb_get_structure to get detailed metadata:
- •Experimental method (X-ray, cryo-EM, NMR)
- •Resolution
- •Entity details (chains, sequence ranges covered)
- •Ligands and small molecules bound
- •Deposition and release dates
- •Source organism
•
Group structures by:
- •Apo structures (no ligands)
- •Ligand-bound (with drug/inhibitor/cofactor)
- •Complex structures (with DNA, RNA, or other proteins)
- •Mutant structures (engineered mutations)

4. Structure Coverage Analysis

•For each detailed PDB structure, note which residue ranges of the full-length protein are resolved.
•Identify gaps — regions of the protein with NO experimental structure coverage.
•Map domain boundaries onto PDB coverage to identify which domains have been structurally characterized.

5. AlphaFold Predicted Structure

•Call alphafold_get_prediction with the UniProt accession.
•
Extract:
- •Overall mean pLDDT confidence
- •Per-region confidence breakdown (very high >90, confident 70-90, low 50-70, very low <50)
- •Identify disordered regions (pLDDT < 50) — these are likely intrinsically disordered
•Compare AlphaFold coverage with PDB coverage — AlphaFold covers the full sequence, so note regions only predicted (not experimentally determined).

6. Pathogenic Variant Mapping

•Call clinvar_search with the gene symbol to get pathogenic/likely pathogenic variants.
•
Map variant positions onto:
- •Domains: Which domains harbor the most pathogenic variants?
- •Active/binding sites: Do any variants hit catalytic or functional residues?
- •Structural context: Are variants in structured (high pLDDT) or disordered regions?
•Identify mutation hotspots — positions or domains with multiple pathogenic variants.

7. Protein Interactions & Interfaces

•Call string_get_interactions with the gene symbol (species: 9606, limit: 10, required_score: 700).
•Cross-reference interaction partners with PDB complex structures — are any STRING partners seen in co-crystal structures?
•Note which structural regions are involved in protein-protein interfaces (from PDB entity details).

8. Literature

•Call pubmed_search with the gene symbol + "crystal structure" OR "cryo-EM structure" (max_results: 8).
•Focus on recent structural papers, especially those describing new structures, drug binding, or mechanistic insights.

Report Format

code

# Structural Biology Report: [PROTEIN NAME] ([GENE SYMBOL])

## Summary
One-paragraph overview: structural knowledge for this protein, key structural features, druggable pockets, and gaps in structural coverage.

## Protein Overview
- Gene: [symbol], Protein: [name], UniProt: [accession]
- Length: [N] amino acids
- Function: [brief description]
- Key domains: [list with positions]

## Domain Architecture
[1]---[Domain A: 50-150]---[Domain B: 200-350]---[N]
Table of all domains with InterPro/Pfam IDs, positions, and descriptions.

## Experimental Structures

### Structure Inventory
| PDB ID | Method | Resolution | Chains | Coverage | Ligands | Year |
|--------|--------|-----------|--------|----------|---------|------|
[All structures found, sorted by resolution]

**Total structures:** [N] | **Best resolution:** [X.X Å] | **Methods:** X-ray ([N]), Cryo-EM ([N]), NMR ([N])

### Key Structures Highlighted

#### Best Overall: [PDB ID] — [Title]
- Method: [X-ray/Cryo-EM], Resolution: [X.X Å]
- Coverage: residues [start]-[end] ([%] of protein)
- Key features: [what makes this structure notable]

#### Best Ligand-Bound: [PDB ID] — [Title]
- Ligand: [name/identifier]
- Binding site residues: [list]
- Drug relevance: [if applicable]

#### Best Complex: [PDB ID] — [Title]
- Complex partners: [proteins/DNA/RNA]
- Interface residues: [key contacts]

### Structure Coverage Map
Visual representation of which protein regions are covered by experimental structures:

Protein: [1]=========================[N] Domain A: [====50-150====] Domain B: [====200-350====] PDB 1XXX: [--30---------180--] PDB 2YYY: [--120--------300--] PDB 3ZZZ: [--280---350--] Gap: [1-29] [181-199] [351-N]

code


## AlphaFold Prediction
- Mean pLDDT: [score]
- High-confidence regions (pLDDT > 90): [ranges]
- Confident regions (70-90): [ranges]
- Low-confidence / disordered (< 50): [ranges]
- Comparison with PDB: [which gaps in PDB are predicted well by AlphaFold?]
- Model URLs: [PDB file], [CIF file]

## Pathogenic Variants on Structure

### Variant Hotspots
| Domain/Region | Variants | Significance |
|--------------|----------|-------------|
[Domains ranked by number of pathogenic variants]

### Key Pathogenic Variants
| Variant | Position | Domain | ClinVar Significance | Structural Context |
|---------|----------|--------|---------------------|-------------------|
[Top 10 pathogenic variants with structural interpretation]

**Interpretation:** [Where do pathogenic variants cluster? Do they affect active sites, stability, or interfaces?]

## Protein Interaction Interfaces
- Top interaction partners (STRING):
  | Partner | Score | Co-crystal Structure? |
  |---------|-------|--------------------|
- Structurally characterized interfaces: [which interactions have PDB structures?]
- Key interface residues: [from complex structures]

## Structural Literature
| # | Title | Year | Journal | PMID |
|---|-------|------|---------|------|
[Top 5-8 structural papers]

## Druggability Assessment
- Known ligand-binding sites from PDB structures
- Druggable pockets: [based on ligand-bound structures]
- Allosteric sites: [if any structures reveal allosteric mechanisms]
- Drug candidates: [if any PDB ligands are known drugs or clinical candidates]

## Key Takeaways
- [3-5 bullet points summarizing structural knowledge, gaps, variant hotspots, and druggability]

## Data Sources
List which databases were queried and whether each returned data successfully.

Keep the report factual — only include data returned by the tools. Do not fabricate PDB IDs, resolutions, or structural details. If a section has no data, write "No data available from [source]."