Use Site-Energy-Difference Models

This guide shows how to use site-energy-difference models inside a kMCpy KMC simulation, including direct Python callables and external codes such as smol, CLEASE, ASE, or a project-specific cluster expansion.

The main rule is simple:

delta_e_site = E_after_hop - E_before_hop

The site model must return that signed energy difference for the proposed event. kMCpy combines it with the KRA barrier as:

effective_barrier = e_kra + delta_e_site / 2

In kMCpy, barriers and site-energy differences are consumed in meV. If a callable or external code returns eV, set units="eV" and kMCpy will convert it.
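
As a worked example of the two rules above, the following sketch converts a site-energy difference to meV and applies the half-difference correction. The helper name `effective_barrier_mev` and the `TO_MEV` table are illustrative assumptions, not part of the kMCpy API.

```python
# Illustrative arithmetic only; effective_barrier_mev is a hypothetical helper,
# not a kMCpy function.
TO_MEV = {"meV": 1.0, "eV": 1000.0}


def effective_barrier_mev(e_kra_mev, delta_e_site, units="meV"):
    """Return e_kra + delta_e_site / 2, with everything expressed in meV."""
    delta_mev = delta_e_site * TO_MEV[units]
    return e_kra_mev + delta_mev / 2.0


# A hop that raises the site energy by 0.04 eV on top of a 300 meV KRA barrier.
forward = effective_barrier_mev(300.0, 0.04, units="eV")

# The reverse hop sees the same difference with the opposite sign.
backward = effective_barrier_mev(300.0, -0.04, units="eV")
```

Here `forward` is 320 meV and `backward` is 280 meV. The two barriers differ by exactly the 40 meV site-energy difference, which is why the sign convention of `delta_e_site` matters.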

Choose The Interface

Use one of these options:

Use case	Interface
No site-energy-difference term	`site_model=None`
A kMCpy `LocalClusterExpansion` supplies the site-energy difference	`LocalClusterExpansion` as `site_model`
A Python function or external runtime supplies the site-energy difference	`SiteEnergyModel`

SiteEnergyModel handles both simple kMCpy-native callables and mapped external codes. If no mapping is supplied, it uses kMCpy’s active-site order and occupation labels directly. If an external code has its own site order or state labels, provide site_mapping and state_mapping; kMCpy builds the mapping once before the KMC loop and then updates only event endpoints.

When the site contribution is another kMCpy LocalClusterExpansion, it still uses the regular compute(simulation_state=..., event=...) method. Its role in CompositeLCEModel determines the meaning: as kra_model it returns E_KRA, and as site_model it returns the site-energy-difference contribution. SiteEnergyModel uses the same public compute(...) method and returns E_after_hop - E_before_hop directly.

What kMCpy Stores

kMCpy stores the simulation occupation as a compact active-site list:

simulation_state.occupations

For a binary mobile-ion/vacancy model this often looks like:

0 = mobile ion
1 = vacancy

For multicomponent models, states may be 0, 1, 2, and so on. These integers are kMCpy state labels. External codes often use a different site order and different occupation labels; when they do, you must provide mappings.

Optional Mappings

Mappings are only needed when an external code uses a different site order or different occupation labels.

site_mapping maps from kMCpy active-site index to the external code’s site index:

site_mapping = {
    0: 12,  # kMCpy active site 0 is external site 12
    1: 18,
    2: 19,
}

state_mapping maps from kMCpy state index to the external occupation value:

state_mapping = {
    0: 0,  # kMCpy mobile-ion state -> smol occupation code
    1: 1,  # kMCpy vacancy state -> smol occupation code
    2: 2,  # another species
}

For CLEASE or ASE, the external value may be a symbol:

state_mapping = {
    0: "Li",
    1: "X",
}

If the same kMCpy state number means different external values on different sublattices, use state_mapping_by_site:

state_mapping_by_site = {
    0: {0: 0, 1: 1},
    1: {0: 3, 1: 4},
}

kMCpy validates the site mapping during initialization. If an event_lib is available, it also checks that the endpoint state mappings needed by the events are present before the KMC loop starts.
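
The same kind of check can be run by hand before building the model. The sketch below is a hypothetical pre-flight helper written for this guide, not kMCpy's own validation, which may check more or raise different errors.

```python
# Hypothetical pre-flight checks; kMCpy performs its own validation separately.
def check_site_mapping(site_mapping, active_site_count):
    """Raise ValueError unless every active site maps to a distinct external site."""
    expected = set(range(active_site_count))
    missing = sorted(expected - set(site_mapping))
    if missing:
        raise ValueError(f"Unmapped kMCpy active sites: {missing}")
    unknown = sorted(set(site_mapping) - expected)
    if unknown:
        raise ValueError(f"Mapping contains unknown active sites: {unknown}")
    external_sites = list(site_mapping.values())
    if len(set(external_sites)) != len(external_sites):
        raise ValueError("Two kMCpy active sites map to the same external site.")


def check_state_mapping(state_mapping, required_states):
    """Raise ValueError unless every state used by the events is mapped."""
    missing = sorted(set(required_states) - set(state_mapping))
    if missing:
        raise ValueError(f"Unmapped kMCpy states: {missing}")


check_site_mapping({0: 12, 1: 18, 2: 19}, active_site_count=3)
check_state_mapping({0: "Li", 1: "X"}, required_states=[0, 1])
```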

During initialization, readable dictionaries/lists are converted to runtime lookup arrays:

site_lookup[kmcpy_active_site] -> external_site
state_lookup[kmcpy_state] -> external_occupation_value
state_lookup_by_site[kmcpy_site][kmcpy_state] -> external_occupation_value

The KMC loop then uses array indexing for event endpoints. It does not walk the mapping dictionaries for every proposed hop.
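
A minimal sketch of that conversion, assuming contiguous integer keys starting at 0. The helper `build_lookup` is hypothetical, and plain Python lists stand in for the runtime arrays so the example has no dependencies.

```python
# Hypothetical helper: turn a readable mapping into an index-addressable lookup.
def build_lookup(mapping, size):
    """Return a list where lookup[kmcpy_index] is the external value."""
    lookup = [None] * size
    for kmcpy_index, external_value in mapping.items():
        lookup[kmcpy_index] = external_value
    return lookup


site_mapping = {0: 12, 1: 18, 2: 19}
state_mapping_by_site = {
    0: {0: 0, 1: 1},
    1: {0: 3, 1: 4},
}

# Built once, before the KMC loop.
site_lookup = build_lookup(site_mapping, size=3)
state_lookup_by_site = [
    build_lookup(state_mapping_by_site[site], size=2)
    for site in sorted(state_mapping_by_site)
]

# Per-event work is then plain indexing, with no dictionary traversal.
external_site = site_lookup[1]
external_value = state_lookup_by_site[1][0]
```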

Build The External Site Mapping

Build site_mapping after the external code has finalized its site order. The dictionary direction is always:

site_mapping[kmcpy_active_site_index] = external_site_index

If the external object is built directly from kMCpy’s active-site structure and keeps the same order, no site mapping is needed:

active_structure = active_site_order.active_structure()
external_runtime = build_external_runtime(active_structure)

site_model = SiteEnergyModel(
    runtime=external_runtime,
    compute_fn=external_delta,
    state_mapping=kmcpy_state_to_external_value,
    units="eV",
)

If the external structure preserves site properties, use the _kmcpy_active_site_index property. This works for either active-only structures or full structures with fixed sites:

from kmcpy.structure.active_site_order import ACTIVE_SITE_PROPERTY


def site_mapping_from_property(external_structure, active_site_order):
    site_mapping = {}

    for external_index, site in enumerate(external_structure):
        active_index = site.properties.get(ACTIVE_SITE_PROPERTY)
        if active_index is None or int(active_index) < 0:
            continue
        site_mapping[int(active_index)] = int(external_index)

    if len(site_mapping) != active_site_order.active_site_count:
        raise ValueError("External structure does not contain every active site.")

    return site_mapping

If the external code drops site properties, match coordinates once before the KMC run:

import numpy as np


def site_mapping_from_coordinates(
    external_structure,
    active_site_order,
    tol=1e-3,  # maximum allowed site-to-site distance, in the lattice's length units
):
    active_structure = active_site_order.active_structure()
    site_mapping = {}
    used_external_sites = set()

    for active_index, active_site in enumerate(active_structure):
        distances = external_structure.lattice.get_all_distances(
            [active_site.frac_coords],
            external_structure.frac_coords,
        )[0]
        external_index = int(np.argmin(distances))

        if float(distances[external_index]) > tol:
            raise ValueError(f"No external site matches active site {active_index}.")
        if external_index in used_external_sites:
            raise ValueError("Two active sites map to the same external site.")

        site_mapping[active_index] = external_index
        used_external_sites.add(external_index)

    return site_mapping

Coordinate matching is a setup step, not a per-event operation. Store the resulting dictionary in the SiteEnergyModel; kMCpy converts it to an array in initialize_state(...).

Site-Order Traceability

kMCpy’s own compact active-site order is defined by ActiveSiteOrder, the same object used when structures, events, and occupations are loaded. When a SiteEnergyModel is initialized during a normal KMC run, kMCpy passes that map to the model. The model records:

active_site_order_hash: the ActiveSiteOrder.fingerprint for the kMCpy active-site order.
external_site_order_hash: a hash of the active-site to external-site mapping and external occupation size.

These hashes are serialized with the model so model files can be traced back to the exact active-site order and external mapping they were built against. If you construct the model manually, pass the ActiveSiteOrder explicitly through the active_site_order argument:

site_model = SiteEnergyModel(
    runtime=external_evaluator,
    compute_fn=external_delta,
    site_mapping=kmcpy_to_external_site,
    state_mapping=kmcpy_state_to_external_value,
    active_site_order=active_site_order,
    units="eV",
)
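
To illustrate what a mapping hash buys you, the sketch below fingerprints a site mapping together with the external occupation size. It is not kMCpy's hashing scheme; the function name, the JSON payload layout, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import json


# Hypothetical fingerprint; kMCpy's actual hash format is not shown in this guide.
def mapping_fingerprint(site_mapping, external_size):
    """Hash a site mapping deterministically, independent of dict insertion order."""
    payload = {
        "site_mapping": sorted((int(k), int(v)) for k, v in site_mapping.items()),
        "external_size": int(external_size),
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


original = mapping_fingerprint({0: 12, 1: 18, 2: 19}, external_size=32)
reordered = mapping_fingerprint({2: 19, 0: 12, 1: 18}, external_size=32)
changed = mapping_fingerprint({0: 12, 1: 19, 2: 18}, external_size=32)
```

The first two fingerprints are identical because only the content of the mapping is hashed, while `changed` differs because two external sites were swapped. Comparing such a value at load time is enough to detect a model file that was built against a different site order.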

Runtime Lifecycle

When mappings or an external runtime are used, SiteEnergyModel does three things:

1. At setup, it builds site/state lookup arrays and one external occupation array from the initial kMCpy occupation.
2. For a proposed event, it maps only the two endpoint changes and calls your compute_fn.
3. After an accepted event, it optionally calls your apply_fn, then updates only the two changed external occupation entries.

So the full occupation conversion happens once, not every event.
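
The lifecycle can be sketched with a small stand-in class. Everything below is hypothetical: `ToyMappedModel` and `EndpointChange` are not kMCpy classes, and the toy assumes that a hop simply exchanges the states of its two endpoint sites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointChange:
    """One changed entry of the external occupation (hypothetical record)."""
    external_site: int
    old_value: object
    new_value: object


class ToyMappedModel:
    """Stand-in that mirrors the three lifecycle steps; not the kMCpy class."""

    def __init__(self, site_lookup, state_lookup, compute_fn):
        self.site_lookup = site_lookup
        self.state_lookup = state_lookup
        self.compute_fn = compute_fn
        self.external_occupation = None

    def initialize_state(self, occupations):
        # Step 1: convert the full occupation once.
        self.external_occupation = [None] * (max(self.site_lookup) + 1)
        for site, state in enumerate(occupations):
            self.external_occupation[self.site_lookup[site]] = self.state_lookup[state]

    def changes_for_hop(self, occupations, site_a, site_b):
        # Assumption: a hop exchanges the states of its two endpoints.
        return [
            EndpointChange(
                external_site=self.site_lookup[site],
                old_value=self.state_lookup[occupations[site]],
                new_value=self.state_lookup[occupations[other]],
            )
            for site, other in ((site_a, site_b), (site_b, site_a))
        ]

    def compute(self, changes):
        # Step 2: evaluate the proposal without touching the cached occupation.
        return self.compute_fn(self.external_occupation, changes)

    def accept(self, changes):
        # Step 3: update only the two changed entries.
        for change in changes:
            self.external_occupation[change.external_site] = change.new_value


def toy_delta(external_occupation, changes):
    return 0.04  # placeholder for E_after_hop - E_before_hop, in eV


occupations = [0, 1, 0]  # kMCpy labels: 0 = mobile ion, 1 = vacancy
model = ToyMappedModel(site_lookup=[2, 0, 1], state_lookup=["Li", "X"], compute_fn=toy_delta)
model.initialize_state(occupations)

changes = model.changes_for_hop(occupations, site_a=0, site_b=1)
delta = model.compute(changes)
after_compute = list(model.external_occupation)  # still the initial occupation
model.accept(changes)
```

After `accept`, only external sites 2 and 0 have changed; the rest of the cached occupation is never rebuilt.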

Smol-Style Example

For smol, keep the smol processor as the runtime object. The delta function turns kMCpy’s mapped endpoint changes into smol-style flips:

import numpy as np

from kmcpy.models import CompositeLCEModel, SiteEnergyModel


def smol_delta(runtime, external_occupation, changes, coefficients):
    flips = [change.as_flip() for change in changes]
    delta_features = runtime.compute_feature_vector_change(
        external_occupation,
        flips,
    )
    return float(np.dot(delta_features, coefficients))  # eV


site_model = SiteEnergyModel(
    runtime=smol_processor,
    compute_fn=smol_delta,
    compute_kwargs={"coefficients": smol_coefficients},
    site_mapping=kmcpy_to_smol_site,
    state_mapping=kmcpy_state_to_smol_code,
    external_dtype="int32",
    units="eV",
)

model = CompositeLCEModel(
    kra_model=kra_lce_model,
    site_model=site_model,
)

compute(...) does not mutate external_occupation. kMCpy updates the cached external occupation only after the event is accepted.

CLEASE-Style Example

For CLEASE or ASE, keep the evaluator as the runtime object. The delta function evaluates the proposed local changes; the apply function commits accepted changes to the external evaluator.

from kmcpy.models import CompositeLCEModel, SiteEnergyModel


def clease_delta(runtime, external_occupation, changes):
    system_changes = [
        make_system_change(
            change.external_site,
            change.old_value,
            change.new_value,
        )
        for change in changes
    ]
    return runtime.get_energy_given_change(system_changes) - runtime.get_energy()


def clease_apply(runtime, external_occupation, changes):
    system_changes = [
        make_system_change(
            change.external_site,
            change.old_value,
            change.new_value,
        )
        for change in changes
    ]
    runtime.apply_system_changes(system_changes, keep=True)


site_model = SiteEnergyModel(
    runtime=clease_evaluator,
    compute_fn=clease_delta,
    apply_fn=clease_apply,
    site_mapping=kmcpy_to_clease_site,
    state_mapping={0: "Li", 1: "X"},
    units="eV",
)

model = CompositeLCEModel(
    kra_model=kra_lce_model,
    site_model=site_model,
)

The external_occupation argument is still updated by kMCpy after accepted events. Use apply_fn only for external runtime state that kMCpy cannot update itself, such as a live evaluator or calculator.
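
The example above calls `make_system_change` without defining it. One possible definition is sketched here, assuming CLEASE's `SystemChange` record from `clease.datastructures` with the fields `index`, `old_symb`, `new_symb`, and `name`; check this against the CLEASE version you have installed. When CLEASE is not importable, a stand-in with the same field names is used so the sketch still runs.

```python
try:
    from clease.datastructures import SystemChange
except ImportError:
    from typing import NamedTuple

    class SystemChange(NamedTuple):
        """Stand-in with the assumed CLEASE field names, for illustration only."""
        index: int
        old_symb: str
        new_symb: str
        name: str = ""


def make_system_change(external_site, old_value, new_value):
    """Wrap one mapped endpoint change in a CLEASE-style record."""
    return SystemChange(
        index=int(external_site),
        old_symb=str(old_value),
        new_symb=str(new_value),
        name="kmc_hop",  # free-form label chosen for this example
    )


change = make_system_change(12, "Li", "X")
```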

Model Files

Live Python objects cannot be serialized into model files. For file-based workflows, provide factory references instead:

site_model = SiteEnergyModel(
    runtime_ref="my_project.smol_site_energy:build_processor",
    runtime_kwargs={"model_file": "smol_ce.json"},
    compute_ref="my_project.smol_site_energy:smol_delta",
    compute_kwargs={"coefficients_file": "eci.npy"},
    site_mapping=kmcpy_to_smol_site,
    state_mapping=kmcpy_state_to_smol_code,
    external_dtype="int32",
    units="eV",
)

site_model.to("site_energy.json")

At runtime, kMCpy resolves runtime_ref, compute_ref, and apply_ref from their import paths.
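
A reference of the form `package.module:callable` can be resolved with the standard library alone. The helper below is a hypothetical sketch of that idea; kMCpy's own resolver may accept other forms or report errors differently.

```python
import importlib


# Hypothetical resolver for "package.module:attribute" strings.
def resolve_ref(ref):
    """Import the module before the colon and return the attribute after it."""
    module_name, separator, attribute_path = ref.partition(":")
    if not separator or not module_name or not attribute_path:
        raise ValueError(f"Expected 'package.module:callable', got {ref!r}")
    target = importlib.import_module(module_name)
    for part in attribute_path.split("."):
        target = getattr(target, part)
    return target


# A standard-library function is used here so the example runs anywhere.
sqrt = resolve_ref("math:sqrt")
root = sqrt(9.0)
```

The practical consequence is that the referenced module, such as `my_project.smol_site_energy` above, must be importable in the environment that loads the model file.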

Direct Callable Example

For a kMCpy-native callable, omit site_mapping and state_mapping. The callable can accept only the arguments it needs.

from kmcpy.models import SiteEnergyModel

site_model = SiteEnergyModel(
    compute_ref="my_project.site_energy:site_energy_difference",
    units="eV",
    compute_kwargs={"model_file": "site_ce.json"},
)

The callable receives:

def site_energy_difference(event, simulation_state, model_file):
    return 0.04  # E_after - E_before, in eV

Do not use this pattern if the callable rebuilds a full smol or CLEASE occupation object every time it is called. Keep the external runtime in SiteEnergyModel and provide mappings instead.
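
The statement that a direct callable can accept only the arguments it needs can be illustrated with signature inspection. How kMCpy dispatches arguments internally is not shown in this guide, so the helper `call_with_accepted_args` below is only one plausible way to do it, and the placeholder objects stand in for a real event and simulation state.

```python
import inspect


# Hypothetical dispatcher: pass only the keyword arguments a callable declares.
def call_with_accepted_args(fn, **available):
    parameters = inspect.signature(fn).parameters
    takes_any_keyword = any(
        parameter.kind is inspect.Parameter.VAR_KEYWORD
        for parameter in parameters.values()
    )
    if takes_any_keyword:
        return fn(**available)
    accepted = {name: value for name, value in available.items() if name in parameters}
    return fn(**accepted)


def needs_only_event(event):
    return 0.04  # E_after - E_before, in eV


def needs_everything(event, simulation_state, model_file):
    return 0.04 if model_file else 0.0


available = {
    "event": object(),             # placeholder for a kMCpy event
    "simulation_state": object(),  # placeholder for the simulation state
    "model_file": "site_ce.json",
}
small = call_with_accepted_args(needs_only_event, **available)
full = call_with_accepted_args(needs_everything, **available)
```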

No Site-Energy Term

If the KRA model already contains everything you need, omit the site model:

model = CompositeLCEModel(
    kra_model=kra_lce_model,
    site_model=None,
)

Checklist

Before running a simulation, check:

compute_fn returns E_after_hop - E_before_hop, not an absolute energy.
units matches the model output.
site_mapping covers every kMCpy active site.
state_mapping or state_mapping_by_site covers every state needed by the events.
compute_fn does not mutate the external occupation for proposed events.
apply_fn mutates external runtime state only for accepted events.
Full occupation and mapping conversion happens in initialize_state(...), not in compute(...).
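
Two of these items can be tested mechanically: that compute_fn leaves the occupation untouched, and that it returns a difference rather than an absolute energy (a forward hop followed by its reverse must sum to zero). The sketch below uses a simplified, hypothetical signature `compute_fn(external_occupation, changes)` with changes given as `(site, new_value)` pairs and a toy on-site energy model; adapt it to the signature your model actually uses.

```python
import copy


def check_compute_fn(compute_fn, external_occupation, forward, reverse, tol=1e-9):
    """Check non-mutation and forward/reverse antisymmetry; return the forward delta."""
    snapshot = copy.deepcopy(external_occupation)
    forward_delta = compute_fn(external_occupation, forward)
    if external_occupation != snapshot:
        raise AssertionError("compute_fn mutated the external occupation.")

    # Apply the forward changes to a copy, then evaluate the reverse hop from there.
    after = copy.deepcopy(external_occupation)
    for site, new_value in forward:
        after[site] = new_value
    reverse_delta = compute_fn(after, reverse)
    if abs(forward_delta + reverse_delta) > tol:
        raise AssertionError("Forward and reverse deltas do not cancel.")
    return forward_delta


# Toy model: each occupied site contributes a fixed on-site energy (eV).
SITE_ENERGY_EV = [0.00, 0.04, 0.10]


def total_energy(occupation):
    return sum(e for e, value in zip(SITE_ENERGY_EV, occupation) if value == "Li")


def toy_delta(external_occupation, changes):
    after = list(external_occupation)  # work on a copy, never on the input
    for site, new_value in changes:
        after[site] = new_value
    return total_energy(after) - total_energy(external_occupation)


occupation = ["Li", "X", "Li"]
delta = check_compute_fn(
    toy_delta,
    occupation,
    forward=[(0, "X"), (1, "Li")],
    reverse=[(0, "Li"), (1, "X")],
)
```

A compute_fn that returned `total_energy(after)` instead of the difference would fail the second check, since the forward and reverse values would no longer cancel.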