Creating Skills¶
Skills are composable plugins that extend the clinical pipeline. Each skill can
affect one or more stages and is a Python class that subclasses BaseSkill.
Architecture overview¶
```
StageRunner
|
+-- PrivacySkill      (screening, intake)
+-- ExtractionSkill   (intake)
+-- DiagnosticsSkill  (diagnosis, exam)
+-- YourCustomSkill   (diagnosis, treatment)
```
When a stage runs, the runner finds all skills that declare that stage in their
metadata and calls their hooks in registration order (the order you pass
them to StageRunner):
- An optional assessment pass via `StageRunner.check_requirements(stage, session)`, which calls `check_requirements()` on matching skills in registration order
- All `pre()` hooks (in registration order)
- All `execute()` hooks (in registration order)
- All `post()` hooks (in registration order)
The system integrator controls execution order — not the skill author.
Skill project structure¶
Every skill project is a directory containing at minimum a skill.yaml metadata
file and a Python module with the skill class:
```
my_skill/
├── skill.yaml          # required: skill metadata
├── skill.py            # required: contains the BaseSkill subclass
├── prompts/            # optional: prompt templates
│   └── diagnosis.txt
├── data/               # optional: reference data, lookup tables
│   └── herbs.json
└── requirements.txt    # optional: extra pip dependencies
```
skill.yaml¶
```yaml
# Required fields
name: my_org.skill_name            # unique identifier
version: 1.0.0                     # semver
entry_point: "skill:MySkillClass"  # module:ClassName within the folder
stages:
  - diagnosis
  - treatment

# Human-readable (optional)
description: >-
  A brief description of what this skill does.
author: "Your Name <email@example.com>"
license: MIT
homepage: https://github.com/my_org/my_skill

# Compatibility (optional)
min_hiperhealth_version: "0.4.0"

# Extra pip dependencies this skill needs (optional)
dependencies:
  - some-package>=1.0
```
| Field | Required | Description |
|---|---|---|
| `name` | yes | Unique skill identifier. Used in `register("name")` |
| `version` | yes | Semver string |
| `entry_point` | yes | `module:ClassName` relative to the skill folder |
| `stages` | yes | List of stage names this skill participates in |
| `description` | no | Human-readable description |
| `author` | no | Author name and contact |
| `license` | no | License identifier |
| `homepage` | no | URL for documentation or source |
| `min_hiperhealth_version` | no | Minimum compatible hiperhealth version |
| `dependencies` | no | Extra pip packages the skill requires |
Minimal skill¶
```python
from hiperhealth.pipeline import BaseSkill, SkillMetadata, Stage
from hiperhealth.pipeline.context import PipelineContext


class GreetingSkill(BaseSkill):
    def __init__(self):
        super().__init__(
            SkillMetadata(
                name='my_org.greeting',
                version='1.0.0',
                stages=(Stage.SCREENING,),
                description='Adds a greeting to the context.',
            )
        )

    def execute(self, stage, ctx):
        name = ctx.patient.get('name', 'Patient')
        ctx.extras['greeting'] = f'Welcome, {name}!'
        return ctx
```
SkillMetadata fields¶
| Field | Type | Default | Description |
|---|---|---|---|
| `name` | `str` | required | Unique identifier, e.g. `my_org.skill_name` |
| `version` | `str` | `"0.1.0"` | Semantic version of the skill |
| `stages` | `tuple[str, ...]` | `()` | Which stages this skill participates in |
| `description` | `str` | `""` | Human-readable description |
Hooks¶
Each skill has four hooks. Override only the ones you need — the base class provides no-op defaults.
check_requirements(stage, ctx) -> list[Inquiry]¶
Called before execution to determine what information is needed. Return a list
of Inquiry objects describing what data the skill needs. The default returns
an empty list (no extra data needed).
In the session workflow, callers do not invoke this hook directly. They call
StageRunner.check_requirements(stage, session, **kwargs), which:
- Builds `ctx` from `session.to_context()`
- Merges both `session.set_clinical_data()` and `session.provide_answers()` data into `ctx.patient`
- Stores extra keyword arguments in `ctx.extras['_run_kwargs']`
- Records `check_requirements_started`, `inquiries_raised`, and `check_requirements_completed` events in the session file
That means skill authors should treat ctx.patient as the current merged
clinical state and only return inquiries for fields that are still missing.
ctx.results also contains outputs from previously completed stages. The only
thing not passed into the hook is the raw session event log itself.
Each inquiry has a priority reflecting clinical data availability:
| Priority | Meaning | Example |
|---|---|---|
| `required` | Must have before this stage can run | Basic symptoms for diagnosis |
| `supplementary` | Would improve results, available now | Dietary history, medication list |
| `deferred` | Only available after a future pipeline step | Lab results (after exam stage) |
Example:
```python
import json
from typing import Literal

from pydantic import BaseModel

from hiperhealth.agents.client import chat_structured
from hiperhealth.pipeline import BaseSkill, Inquiry, SkillMetadata, Stage
from hiperhealth.pipeline.context import PipelineContext


class InquiryDraft(BaseModel):
    field: str
    label: str
    description: str = ''
    priority: Literal['required', 'supplementary', 'deferred'] = (
        'supplementary'
    )
    input_type: str = 'text'
    choices: list[str] | None = None


class InquiryDraftList(BaseModel):
    inquiries: list[InquiryDraft]


class GutMicrobiomeSkill(BaseSkill):
    def __init__(self):
        super().__init__(SkillMetadata(
            name='gut_microbiome',
            stages=(Stage.DIAGNOSIS, Stage.TREATMENT),
        ))

    def check_requirements(
        self, stage: str, ctx: PipelineContext
    ) -> list[Inquiry]:
        if stage != Stage.DIAGNOSIS:
            return []

        run_kwargs = ctx.extras.get('_run_kwargs', {})
        llm = run_kwargs.get('llm')
        llm_settings = run_kwargs.get('llm_settings')

        system_prompt = (
            'You are a clinical assistant specialized in gut microbiome care. '
            'Review the full anamnesis and prior stage outputs. '
            'Return only the additional information that would be most useful '
            'for this skill and is not already present. '
            'Prioritize safety-critical items as "required", useful but '
            'non-blocking items as "supplementary", and items that naturally '
            'arrive later as "deferred".'
        )
        payload = {
            'patient': ctx.patient,
            'results': ctx.results,
            'stage': stage,
        }
        response = chat_structured(
            system_prompt,
            json.dumps(payload, ensure_ascii=False),
            InquiryDraftList,
            session_id=ctx.session_id,
            llm=llm,
            llm_settings=llm_settings,
        )

        existing_fields = set(ctx.patient.keys())
        return [
            Inquiry(
                skill_name=self.metadata.name,
                stage=stage,
                field=item.field,
                label=item.label,
                description=item.description,
                priority=item.priority,
                input_type=item.input_type,
                choices=item.choices,
            )
            for item in response.inquiries
            if item.field not in existing_fields
        ]
```
Requirement / answer loop¶
provide_answers() does not call any skill hooks by itself. It appends an
answers_provided event, and those fields become part of
session.clinical_data the next time the runner builds a context.
```python
from hiperhealth.pipeline import Session, Stage, StageRunner

runner = StageRunner(skills=[GutMicrobiomeSkill()])
session = Session.create('/tmp/case.parquet')
session.set_clinical_data({'symptoms': 'bloating'})

inquiries = runner.check_requirements(Stage.DIAGNOSIS, session)

session.provide_answers({'dietary_history': 'high carb, low fiber'})

# ctx.patient will now include both symptoms and dietary_history
inquiries = runner.check_requirements(Stage.DIAGNOSIS, session)

required = [i for i in inquiries if i.priority == 'required']
if not required:
    runner.run_session(Stage.DIAGNOSIS, session)
```
You can inspect session.pending_inquiries at any point to see which recorded
inquiries still do not have matching fields in session.clinical_data.
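For example, to surface the blocking questions first (a sketch, assuming `pending_inquiries` yields the same `Inquiry` objects recorded by `check_requirements()`):

```python
# Unanswered inquiries, most urgent first (sketch).
blocking = [
    inquiry
    for inquiry in session.pending_inquiries
    if inquiry.priority == 'required'
]
for inquiry in blocking:
    print(f'{inquiry.field}: {inquiry.label}')
```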
pre(stage, ctx) -> PipelineContext¶
Called before the main execution. Use it to prepare data, inject prompt fragments, or validate preconditions.
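As a sketch (the skill name and the `vitals` field are illustrative, not part of the built-in pipeline), a `pre()` hook that validates a precondition might look like:

```python
from hiperhealth.pipeline import BaseSkill, SkillMetadata, Stage


class VitalsGuardSkill(BaseSkill):
    """Illustrative sketch: flags missing vitals before diagnosis runs."""

    def __init__(self):
        super().__init__(SkillMetadata(
            name='my_org.vitals_guard',
            stages=(Stage.DIAGNOSIS,),
        ))

    def pre(self, stage, ctx):
        # Runs before any execute() hook for this stage.
        if 'vitals' not in ctx.patient:
            ctx.extras['vitals_warning'] = 'No vitals recorded before diagnosis.'
        return ctx
```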
execute(stage, ctx) -> PipelineContext¶
The main work of the skill. Read from ctx.patient, ctx.results, or
ctx.extras, and write results to ctx.results[stage].
post(stage, ctx) -> PipelineContext¶
Called after execution. Use it for logging, cleanup, or result transformation.
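A sketch of a `post()` hook that annotates the stage result (the skill name and annotation key are illustrative):

```python
from hiperhealth.pipeline import BaseSkill, SkillMetadata, Stage


class AuditNoteSkill(BaseSkill):
    """Illustrative sketch: annotates results after a stage completes."""

    def __init__(self):
        super().__init__(SkillMetadata(
            name='my_org.audit_note',
            stages=(Stage.TREATMENT,),
        ))

    def post(self, stage, ctx):
        # Runs after all execute() hooks; transform or tag their output here.
        result = ctx.results.get(stage)
        if isinstance(result, dict):
            result['reviewed_by'] = self.metadata.name
        return ctx
```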
PipelineContext¶
The context is a Pydantic model that flows between stages:
```python
class PipelineContext(BaseModel):
    patient: dict[str, Any] = {}     # Patient data
    language: str = 'en'             # Prompt language
    session_id: str | None = None    # Session tracking
    results: dict[str, Any] = {}     # Stage results, keyed by stage name
    audit: list[AuditEntry] = []     # Execution audit log
    extras: dict[str, Any] = {}      # Skill-specific data, prompt fragments
```
Serialization¶
The context serializes to JSON for persistence between invocations:
```python
# Save
json_str = ctx.model_dump_json()

# Restore
ctx = PipelineContext.model_validate_json(json_str)
```
This allows stages to run hours or days apart, by different actors.
Modifying prompts from a skill¶
Skills can inject additional instructions into the prompts used by
DiagnosticsSkill via prompt fragments:
```python
class AyurvedaSkill(BaseSkill):
    def __init__(self):
        super().__init__(
            SkillMetadata(
                name='ayurveda',
                stages=(Stage.DIAGNOSIS, Stage.TREATMENT),
            )
        )

    def pre(self, stage, ctx):
        fragments = ctx.extras.setdefault('prompt_fragments', {})
        fragments[stage] = (
            'Also consider Ayurvedic perspectives and traditional '
            'Indian medicine approaches.'
        )
        return ctx
```
The DiagnosticsSkill checks ctx.extras['prompt_fragments'] in two places:
- `{stage}` for the main execution prompt
- `{stage}_requirements` for the requirement-gathering prompt used by `check_requirements()`
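For example, a skill placed before `DiagnosticsSkill` in registration order could also steer requirement gathering (a sketch; the fragment text is illustrative):

```python
def check_requirements(self, stage, ctx):
    fragments = ctx.extras.setdefault('prompt_fragments', {})
    # Read by DiagnosticsSkill.check_requirements() for the same stage.
    fragments[f'{stage}_requirements'] = (
        'Always ask about herbal preparations currently in use.'
    )
    return []  # this skill raises no inquiries of its own
```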
Skill registry¶
The SkillRegistry treats a repository or folder as a channel and a folder
inside skills/ as an installable skill. Channel-based skills use canonical
ids in the form <local_channel_name>.<skill_name>, such as tm.ayurveda.
Recommended channel repository layout¶
```
channel-repo/
├── skills-channel.yaml
├── README.md
├── LICENSE
├── docs/
├── infra/
├── scripts/
├── shared/
├── skills/
│   ├── ayurveda/
│   │   ├── skill.yaml
│   │   └── skill.py
│   ├── nutrition/
│   │   ├── skill.yaml
│   │   └── skill.py
│   └── triage/
│       ├── skill.yaml
│       └── skill.py
└── tests/
```
Only skills explicitly declared in skills-channel.yaml are installable.
Channel metadata does not need `discovery`, `ignore`, `path`, or `manifest` fields, because every declared skill is resolved automatically to `skills/<name>/skill.yaml`.
skills-channel.yaml¶
```yaml
api_version: 1

channel:
  name: traditional-medicine
  display_name: Traditional Medicine
  default_alias: tm
  version: 0.1.0
  description: Complementary and traditional medicine skills
  homepage: https://github.com/my-org/traditional-medicine
  license: BSD-3-Clause
  min_hiperhealth_version: ">=0.5.0"

skills:
  - name: ayurveda
    enabled: true
    tags: [traditional-medicine, treatment]
  - name: nutrition
    enabled: true
    tags: [nutrition]
  - name: triage
    enabled: false
    tags: [screening]
```
Each declared skill is expected to live at skills/<name>/skill.yaml, which
keeps the per-skill manifest format explicit while removing repeated path
boilerplate from the channel manifest.
Python API for scripts and notebooks¶
```python
from hiperhealth.pipeline import SkillRegistry

registry = SkillRegistry()

registry.add_channel(
    'https://github.com/my-org/traditional-medicine.git',
    local_name='tm',
)

registry.list_channels()
registry.list_channel_skills('tm')

registry.list_skills()
registry.list_skills(channel='tm')

registry.install_skill('tm.ayurveda')
registry.install_channel('tm')
registry.update_skill('tm.ayurveda')
registry.remove_skill('tm.ayurveda')
```
Channel registration and source detection¶
SkillRegistry.add_channel() accepts:
- Local folder paths whose root contains `skills-channel.yaml`
- GitHub URLs
- GitLab URLs
- Generic git URLs
The source root is interpreted with these rules:
- If the root contains `skills-channel.yaml`, it is treated as a channel.
- If the root does not contain `skills-channel.yaml`, registration fails with a validation error.
- Local folders are copied as-is into the local registry checkout.
- Remote git sources are cloned into the local registry checkout.
- `ref=` is supported for remote git sources only.
Local channel aliases follow these rules:
- `local_name` can be provided explicitly, or derived from `channel.default_alias`, or finally from `channel.name`.
- Local aliases must be unique within the local registry.
- Local aliases may contain letters, numbers, `_`, and `-`.
- Local aliases cannot contain `.` because canonical skill ids use `<local_name>.<skill_name>`.
- The alias `hiperhealth` is reserved for built-in skills.
Full Python API¶
The public registry API is intended to be comfortable in scripts and notebooks:
```python
from hiperhealth.pipeline import SkillRegistry

registry = SkillRegistry()

registry.add_channel(
    'https://github.com/my-org/traditional-medicine.git',
    local_name='tm',
    ref='main',
)
registry.add_channel('/srv/system-x/channels/traditional-medicine')

registry.list_channels()
registry.list_channel_skills('tm')
registry.update_channel('tm')
registry.update_channel('tm', ref='main')
registry.remove_channel('tm')

registry.list_skills()
registry.list_skills(channel='tm')
registry.list_skills(channel='tm', installed_only=True)

registry.install_skill('tm.ayurveda')
registry.install_channel('tm')
registry.install_channel('tm', include_disabled=True)
registry.update_skill('tm.ayurveda')
registry.update_skill('tm.ayurveda', pull_channel=True)
registry.remove_skill('tm.ayurveda')

registry.load('tm.ayurveda')
```
StageRunner.register() uses the same canonical ids:
```python
from hiperhealth.pipeline import StageRunner, create_default_runner

runner = create_default_runner()
runner.register('tm.ayurveda', index=0)
```
Built-in skills continue to use the built-in canonical ids:
```python
runner.register('hiperhealth.privacy')
runner.register('hiperhealth.extraction')
runner.register('hiperhealth.diagnostics')
```
External channel skills must be installed before load() or
StageRunner.register() can activate them.
CLI¶
The CLI is a thin wrapper over the same registry API:
```
hiperhealth channel add https://github.com/my-org/traditional-medicine.git --name tm
hiperhealth channel add /srv/system-x/channels/traditional-medicine --name tm
hiperhealth channel list
hiperhealth channel skills tm
hiperhealth channel update tm
hiperhealth channel update tm --ref main
hiperhealth channel remove tm
hiperhealth channel install tm --all
hiperhealth channel install tm --all --include-disabled

hiperhealth skill list --channel tm
hiperhealth skill install tm.ayurveda
hiperhealth skill update tm.ayurveda --pull
hiperhealth skill remove tm.ayurveda
```
Update semantics¶
Channel and skill updates are intentionally separate:
- `registry.update_channel('tm')` refreshes the local channel checkout from its original source, then refreshes every currently installed skill from that channel.
- `registry.update_skill('tm.ayurveda')` refreshes one installed skill against the current local checkout of channel `tm`, without first refreshing the channel source.
- `registry.update_skill('tm.ayurveda', pull_channel=True)` first refreshes the owning channel checkout, then updates the installed skill metadata.
- `hiperhealth channel update tm` is the CLI equivalent of `registry.update_channel('tm')`.
- `hiperhealth skill update tm.ayurveda --pull` is the CLI equivalent of `registry.update_skill('tm.ayurveda', pull_channel=True)`.
Validation and error behavior¶
The registry enforces these rules:
- All skill names in a channel must be unique within that channel.
- Every declared skill must have a matching `skills/<name>/` directory.
- Every declared skill must have a matching `skills/<name>/skill.yaml`.
- Each declared skill manifest must be a valid `skill.yaml`.
- Built-in skills cannot be installed or removed with channel skill commands.
- Channel skill operations use canonical ids such as `tm.ayurveda`.
- Skill names only need to be unique within a channel, not globally.
Local storage layout¶
```
~/.hiperhealth/
├── registry/
│   ├── channels.json
│   ├── skills.json
│   └── state.json
├── channels/
│   └── tm/
│       ├── repo/
│       ├── skills-channel.yaml
│       └── channel.json
└── artifacts/
    └── skills/
```
Each registered channel keeps a single local checkout under
~/.hiperhealth/channels/<local_name>/repo. Installed channel skills reference
that checkout instead of copying the full repository per skill.
Channel-only registry sources¶
The external registry flow is channel-based:
- Use `registry.add_channel(...)` for either a local folder or a remote git source.
- Use `registry.install_skill(...)` or `registry.install_channel(...)` to activate skills from that channel.
- The channel root must contain `skills-channel.yaml`.
- Each skill folder must contain `skill.yaml`.
Using the runner¶
Register skills at construction time¶
The list order defines execution order:
```python
from hiperhealth.pipeline import StageRunner, Stage

runner = StageRunner(skills=[
    PrivacySkill(),      # runs first
    ExtractionSkill(),   # runs second
    DiagnosticsSkill(),  # runs third
    AyurvedaSkill(),     # runs last
])

ctx = runner.run(Stage.DIAGNOSIS, ctx)
```
Run multiple stages¶
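As a sketch (assuming `run_many()`, referenced later in this page, accepts an ordered list of stages and threads one context through them):

```python
from hiperhealth.pipeline import Stage

# Assumption: run_many() runs the stages in order, passing each stage
# the context produced by the previous one.
ctx = runner.run_many([Stage.INTAKE, Stage.DIAGNOSIS, Stage.TREATMENT], ctx)
```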
Pass extra arguments¶
Extra keyword arguments to run() are available to skills via
ctx.extras['_run_kwargs']:
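For example (a sketch; the `llm` keyword matches the one read by the GutMicrobiomeSkill example above, but any keyword your skills agree on works):

```python
ctx = runner.run(Stage.DIAGNOSIS, ctx, llm='my-model')

# Inside any hook of a registered skill:
run_kwargs = ctx.extras.get('_run_kwargs', {})
llm = run_kwargs.get('llm')  # 'my-model'
```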
Temporarily disable skills¶
Use runner.disabled(...) when you want to compare a stage with and without
specific registered skills, without uninstalling them or changing the registry:
```python
with runner.disabled({'tm.ayurveda'}):
    ctx_without_ayurveda = runner.run(Stage.TREATMENT, ctx)

ctx_with_ayurveda = runner.run(Stage.TREATMENT, ctx)
```
For one-off calls, run(), run_many(), run_session(), and
check_requirements() also accept disabled_skills=.
Stages¶
The built-in stages are defined as a string enum:
| Stage | Value | Typical use |
|---|---|---|
| `Stage.SCREENING` | `"screening"` | Initial triage, PII de-identification |
| `Stage.INTAKE` | `"intake"` | Data extraction from files |
| `Stage.DIAGNOSIS` | `"diagnosis"` | Differential diagnosis |
| `Stage.EXAM` | `"exam"` | Exam/procedure suggestions |
| `Stage.TREATMENT` | `"treatment"` | Treatment planning |
| `Stage.PRESCRIPTION` | `"prescription"` | Prescription generation |
Custom string stage names also work — the runner accepts any string, not only enum values.
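For example (a sketch; `'follow_up'` is an invented stage name, not part of the built-in enum):

```python
from hiperhealth.pipeline import BaseSkill, SkillMetadata, StageRunner


class FollowUpSkill(BaseSkill):
    def __init__(self):
        super().__init__(SkillMetadata(
            name='my_org.follow_up',
            stages=('follow_up',),  # plain string, not a Stage enum member
        ))

    def execute(self, stage, ctx):
        ctx.results['follow_up'] = {'scheduled': True}
        return ctx


runner = StageRunner(skills=[FollowUpSkill()])
ctx = runner.run('follow_up', ctx)
```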
Skill discovery via entry points¶
Third-party skills can also be auto-discovered if they register as Python entry points:
```toml
# In the skill package's pyproject.toml
[project.entry-points."hiperhealth.skills"]
ayurveda = "my_package:AyurvedaSkill"
```
Then discover and use them:
```python
from hiperhealth.pipeline import discover_skills, StageRunner

third_party = discover_skills()
runner = StageRunner(skills=third_party)
```
Example: full custom skill¶
Here is a complete example of a skill that adds intake data enrichment:
```python
from hiperhealth.pipeline import BaseSkill, SkillMetadata, Stage
from hiperhealth.pipeline.context import PipelineContext


class BMICalculatorSkill(BaseSkill):
    """Calculates BMI from height and weight in patient data."""

    def __init__(self):
        super().__init__(
            SkillMetadata(
                name='my_clinic.bmi_calculator',
                version='1.0.0',
                stages=(Stage.INTAKE,),
                description='Calculates BMI from patient height and weight.',
            )
        )

    def execute(self, stage, ctx):
        height = ctx.patient.get('height_m')
        weight = ctx.patient.get('weight_kg')
        if height and weight and height > 0:
            bmi = weight / (height ** 2)
            intake = ctx.results.setdefault(Stage.INTAKE, {})
            intake['bmi'] = round(bmi, 1)
            intake['bmi_category'] = self._categorize(bmi)
        return ctx

    def _categorize(self, bmi):
        if bmi < 18.5:
            return 'underweight'
        elif bmi < 25:
            return 'normal'
        elif bmi < 30:
            return 'overweight'
        return 'obese'
```
With skill.yaml:
```yaml
name: my_clinic.bmi_calculator
version: 1.0.0
entry_point: "skill:BMICalculatorSkill"
stages:
  - intake
description: Calculates BMI from patient height and weight.
```
Usage:
```python
from hiperhealth.pipeline import (
    PipelineContext, SkillRegistry, Stage, create_default_runner,
)

# Register the channel and install the skill
registry = SkillRegistry()
registry.add_channel('/path/to/my_clinic_channel_repo', local_name='clinic')
registry.install_skill('clinic.bmi_calculator')

# Use it in a pipeline
runner = create_default_runner()
runner.register('clinic.bmi_calculator')

ctx = PipelineContext(
    patient={'height_m': 1.75, 'weight_kg': 70},
)
ctx = runner.run(Stage.INTAKE, ctx)

print(ctx.results['intake']['bmi'])           # 22.9
print(ctx.results['intake']['bmi_category'])  # normal
```
Testing skills¶
Skills are plain Python classes, so they are straightforward to test:
```python
from hiperhealth.pipeline import PipelineContext, Stage, StageRunner


def test_bmi_calculator():
    skill = BMICalculatorSkill()
    runner = StageRunner(skills=[skill])
    ctx = PipelineContext(
        patient={'height_m': 1.80, 'weight_kg': 90},
    )

    ctx = runner.run(Stage.INTAKE, ctx)

    assert ctx.results[Stage.INTAKE]['bmi'] == 27.8
    assert ctx.results[Stage.INTAKE]['bmi_category'] == 'overweight'
```