Usage¶
Pipeline¶
The pipeline is the recommended way to use hiperhealth. It runs clinical stages independently, with each stage implemented by composable skills.
Running a single stage¶
```python
from hiperhealth.pipeline import PipelineContext, Stage, create_default_runner

runner = create_default_runner()
ctx = PipelineContext(
    patient={'symptoms': 'chest pain, shortness of breath', 'age': 45},
    language='en',
    session_id='visit-1',
)
ctx = runner.run(Stage.DIAGNOSIS, ctx)
print(ctx.results['diagnosis'].summary)
print(ctx.results['diagnosis'].options)
```
Running multiple stages¶
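Stages can be chained by feeding each stage's returned context into the next. A minimal sketch, reusing only the API shown in the single-stage example above (the exact stage ordering here is an assumption, not a requirement of the library):

```python
from hiperhealth.pipeline import PipelineContext, Stage, create_default_runner

runner = create_default_runner()
ctx = PipelineContext(
    patient={'symptoms': 'chest pain, shortness of breath', 'age': 45},
    language='en',
    session_id='visit-1',
)

# Each run() returns an updated context that the next stage consumes.
for stage in (Stage.SCREENING, Stage.INTAKE, Stage.DIAGNOSIS):
    ctx = runner.run(stage, ctx)

print(ctx.results['diagnosis'].summary)
```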
Persisting context between sessions¶
Stages can be executed at different times by different actors. Serialize the context to JSON between invocations:
```python
# Monday — nurse runs screening
ctx = PipelineContext(
    patient={'symptoms': 'Patient John has fever and cough', 'age': 30},
    language='pt',
    session_id='encounter-42',
)
runner = create_default_runner()
ctx = runner.run(Stage.SCREENING, ctx)

# Save to database, file, or message queue
saved_json = ctx.model_dump_json()

# Wednesday — physician restores context and runs diagnosis
ctx = PipelineContext.model_validate_json(saved_json)
ctx = runner.run(Stage.DIAGNOSIS, ctx)
```
Available stages¶
| Stage | Description |
|---|---|
| `screening` | Initial triage, PII de-identification |
| `intake` | Data extraction from reports and wearable files |
| `diagnosis` | LLM-powered differential diagnosis |
| `exam` | Exam/procedure suggestions based on diagnosis |
| `treatment` | Treatment planning (extensible via custom skills) |
| `prescription` | Prescription generation (extensible via custom skills) |
Built-in skills¶
The create_default_runner() factory registers three built-in skills in this
order:
| Skill | Stages | Description |
|---|---|---|
| `PrivacySkill` | screening, intake | De-identifies PII in patient data |
| `ExtractionSkill` | intake | Extracts text from medical reports and wearable data |
| `DiagnosticsSkill` | diagnosis, exam | LLM-powered diagnosis and exam suggestions |
Skills run in registration order, so PrivacySkill always runs before
ExtractionSkill within the same stage.
Installing and registering custom skills¶
Custom skills are typically published through channel repositories. Register a channel once, inspect its skills, then install and register the canonical skill id you want:
```python
from hiperhealth.pipeline import SkillRegistry, create_default_runner

registry = SkillRegistry()
registry.add_channel(
    'https://github.com/my-org/traditional-medicine.git',
    local_name='tm',
)
registry.list_channels()
registry.list_channel_skills('tm')
registry.install_skill('tm.ayurveda')

runner = create_default_runner()
runner.register('tm.ayurveda', index=0)
```
In notebooks and scripts, registry.list_skills() is useful for exploring all
registered channel skills and built-ins from one place.
```python
registry.list_skills()
registry.list_skills(channel='tm')
registry.list_skills(channel='tm', installed_only=True)
```
To compare results with and without a specific skill, temporarily disable it at the runner layer without uninstalling or unregistering it:
```python
with runner.disabled({'tm.ayurveda'}):
    ctx_without_ayurveda = runner.run(Stage.TREATMENT, ctx)

ctx_with_ayurveda = runner.run(Stage.TREATMENT, ctx)
```
For one-off calls, run() also accepts disabled_skills=.
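For example, a sketch assuming the same `runner` and `ctx` as in the context-manager example above:

```python
# Equivalent to the context manager above, but scoped to a single call.
ctx_without_ayurveda = runner.run(
    Stage.TREATMENT,
    ctx,
    disabled_skills={'tm.ayurveda'},
)
```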
For channel lifecycle operations such as `update_channel()`, `remove_channel()`, and `install_channel(include_disabled=True)`, as well as the full channel repository layout and the `skills-channel.yaml` / `skill.yaml` manifest schema, see Creating Skills.
Session-based workflow¶
For multi-visit clinical scenarios, sessions provide a parquet-backed event log that persists the full interaction history. The calling system manages the session file lifecycle (storage, deletion, retention).
Creating and loading sessions¶
```python
from hiperhealth.pipeline import Session

# Create a new session
session = Session.create('/data/sessions/patient-visit.parquet', language='en')

# Provide clinical data (no PII — only clinical information)
session.set_clinical_data({
    'symptoms': 'chronic bloating, fatigue',
    'age': 34,
    'biological_sex': 'female',
})

# Load an existing session (e.g., days later)
session = Session.load('/data/sessions/patient-visit.parquet')
```
Checking requirements before execution¶
Skills declare what information they need before a stage can run. Use
check_requirements() to gather inquiries from all relevant skills:
```python
from hiperhealth.pipeline import Session, Stage, create_default_runner

runner = create_default_runner()
session = Session.load('/data/sessions/patient-visit.parquet')

inquiries = runner.check_requirements(Stage.DIAGNOSIS, session)
for inq in inquiries:
    print(f'[{inq.priority}] {inq.field}: {inq.label}')
```
Inquiries have three priority levels reflecting clinical data availability:
| Priority | Meaning | Example |
|---|---|---|
| `required` | Must have before this stage can run | Basic symptoms for diagnosis |
| `supplementary` | Would improve results, available now | Dietary history, medication list |
| `deferred` | Only available after a future pipeline step | Lab results (after exam stage) |
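A common pattern is to group inquiries by priority and gate stage execution on the `required` bucket. A minimal, self-contained sketch — the `Inquiry` dataclass here is a stand-in for the objects returned by `check_requirements()`:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Inquiry:  # stand-in for the inquiry objects from check_requirements()
    priority: str
    field: str
    label: str

inquiries = [
    Inquiry('required', 'symptoms', 'Primary symptoms'),
    Inquiry('supplementary', 'diet', 'Dietary history'),
    Inquiry('deferred', 'labs', 'Lab results'),
]

# Group unanswered inquiries by priority level.
by_priority = defaultdict(list)
for inq in inquiries:
    by_priority[inq.priority].append(inq.field)

# A stage is runnable once no 'required' inquiries remain unanswered.
can_run = not by_priority['required']
```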
Providing answers and running stages¶
```python
# Patient provides answers
session.provide_answers({'dietary_history': 'High carb, low fiber...'})

# Re-check — are required fields satisfied?
inquiries = runner.check_requirements(Stage.DIAGNOSIS, session)
required = [i for i in inquiries if i.priority == 'required']
if not required:
    runner.run_session(Stage.DIAGNOSIS, session, llm=my_llm)
```
Multi-visit workflow¶
Not all data is available at the same time. A typical multi-visit flow:
```python
# Visit 1: Preliminary diagnosis with available data
runner.run_session(Stage.DIAGNOSIS, session, llm=my_llm)
runner.run_session(Stage.EXAM, session, llm=my_llm)  # requests lab work

# Visit 2: Lab results arrive, re-run with enriched data
session = Session.load('/data/sessions/patient-visit.parquet')
session.provide_answers({'stool_analysis': lab_results})
runner.run_session(Stage.DIAGNOSIS, session, llm=my_llm)  # complete diagnosis

# Visit 3: Treatment plan
runner.check_requirements(Stage.TREATMENT, session)
runner.run_session(Stage.TREATMENT, session, llm=my_llm)
```
Inspecting session state¶
```python
session.clinical_data      # all patient data (merged from events)
session.results            # stage results keyed by stage name
session.pending_inquiries  # unanswered inquiries
session.stages_completed   # which stages have run
session.events             # raw event log
```
Interactive analysis in Jupyter notebooks¶
hiperhealth is designed to work as a data science framework for clinical analysis. Physicians can use it directly from Jupyter notebooks to study patient cases:
```python
from hiperhealth.pipeline import Session, Stage, create_default_runner

runner = create_default_runner()
session = Session.create('/tmp/case-study.parquet')
session.set_clinical_data({
    'symptoms': 'chronic fatigue, joint pain, morning stiffness',
    'age': 52,
    'biological_sex': 'female',
    'family_history': 'rheumatoid arthritis (mother)',
})

# Check what information skills need
inquiries = runner.check_requirements(Stage.DIAGNOSIS, session)
for inq in inquiries:
    print(f'  [{inq.priority}] {inq.label}')

# Provide supplementary data interactively
session.provide_answers({
    'rheumatoid_factor': 'positive, 45 IU/mL',
    'anti_ccp': 'positive, 120 U/mL',
    'esr': '38 mm/hr',
})

# Run diagnosis
runner.run_session(Stage.DIAGNOSIS, session, llm=my_llm)
print(session.results[Stage.DIAGNOSIS])
```
Analyzing session data with pandas or polars¶
The session file is standard Parquet and can be queried directly:
```python
import polars as pl

df = pl.read_parquet('/tmp/case-study.parquet')

# See all events
df

# Filter to specific event types
df.filter(pl.col('event_type') == 'inquiries_raised')

# See what stages have been completed
df.filter(pl.col('event_type') == 'stage_completed').select('stage', 'timestamp')
```
Diagnostics¶
The diagnostics helpers return LLMDiagnosis objects with:
- `summary`: short summary text
- `options`: suggested diagnoses or exam/procedure names
Supported output languages are `en`, `pt`, `es`, `fr`, and `it`.
Unknown language values fall back to English.
Differential diagnosis¶
```python
from hiperhealth.skills.diagnostics.core import differential

patient = {
    'age': 45,
    'gender': 'M',
    'symptoms': 'chest pain, shortness of breath',
    'previous_tests': 'ECG normal',
}
result = differential(patient, language='en', session_id='demo-1')
print(result.summary)
print(result.options)
```
Suggested exams and procedures¶
```python
from hiperhealth.skills.diagnostics.core import exams

result = exams(
    ['Acute coronary syndrome'],
    language='en',
    session_id='demo-1',
)
print(result.summary)
print(result.options)
```
Runtime configuration in code¶
```python
from hiperhealth.skills.diagnostics.core import differential
from hiperhealth.llm import LLMSettings

settings = LLMSettings(
    provider='ollama',
    model='llama3.2:3b',
    api_params={'base_url': 'http://localhost:11434/v1'},
)
result = differential(
    {'symptoms': 'headache'},
    llm_settings=settings,
)
```
Medical report extraction¶
Medical reports are extracted locally from PDF or image files. The extractor returns text and metadata, not FHIR resources.
Supported inputs: `pdf`, `png`, `jpg`, and `jpeg`.
Example:
```python
from hiperhealth.skills.extraction.medical_reports import (
    MedicalReportFileExtractor,
)

extractor = MedicalReportFileExtractor()
report = extractor.extract_report_data(
    'tests/data/reports/pdf_reports/report-1.pdf'
)
print(report['source_name'])
print(report['mime_type'])
print(report['text'][:200])
```
Returned payload keys: `source_name`, `source_type`, `mime_type`, and `text`.
If you only need the raw text:
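A minimal sketch built from the payload keys above — the `text` key of the returned dictionary already holds the extracted text:

```python
from hiperhealth.skills.extraction.medical_reports import (
    MedicalReportFileExtractor,
)

extractor = MedicalReportFileExtractor()
text = extractor.extract_report_data(
    'tests/data/reports/pdf_reports/report-1.pdf'
)['text']
```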
Wearable data extraction¶
Wearable data extraction supports CSV and JSON inputs and returns a normalized list of dictionaries.
```python
from hiperhealth.skills.extraction.wearable import WearableDataFileExtractor

extractor = WearableDataFileExtractor()
data = extractor.extract_wearable_data(
    'tests/data/wearable/wearable_data.csv'
)
print(data[:2])
```
De-identification¶
```python
from hiperhealth.skills.privacy.deidentifier import (
    Deidentifier,
    deidentify_patient_record,
)

engine = Deidentifier()
record = {
    'symptoms': 'Patient John Doe reports severe headache.',
    'mental_health': 'Lives at 123 Main St',
}
clean = deidentify_patient_record(record, engine)
print(clean)
```
Raw LLM output capture¶
Diagnostics responses are normalized and then written to `data/llm_raw/` using the supplied `session_id` when present.
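To inspect captures, you can list the directory directly. A small `pathlib` sketch — the filename layout inside `data/llm_raw/` is not specified here, so this simply lists whatever is present:

```python
from pathlib import Path

raw_dir = Path('data/llm_raw')

# Collect capture files, tolerating a missing directory (no captures yet).
captures = sorted(raw_dir.glob('*')) if raw_dir.exists() else []
for path in captures:
    print(path.name)
```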
Backward compatibility¶
The old import paths continue to work:
```python
# These still work
from hiperhealth.agents.diagnostics.core import differential, exams
from hiperhealth.agents.extraction.medical_reports import MedicalReportFileExtractor
from hiperhealth.agents.extraction.wearable import WearableDataFileExtractor
from hiperhealth.privacy.deidentifier import Deidentifier
```
The canonical locations are now under hiperhealth.skills.* and
hiperhealth.pipeline.