Choose Basis Functions
Basis functions convert occupation states into numerical features for local cluster expansion.
Occupation State Indices
kMCpy stores active-site occupations as state indices. For a site with allowed species:
["Na", "X"]
state 0 means Na and state 1 means vacancy X.
For:
["Si", "P", "Al"]
state 0 means Si, state 1 means P, and state 2 means Al.
The allowed species order in site_mapping therefore matters.
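The index convention above can be sketched in a few lines. This is only an illustration of the ordering rule, not kMCpy code: the helper names `species_to_state` and `state_to_species` are hypothetical, and the allowed species are passed as plain Python lists.

```python
def species_to_state(allowed_species, species):
    """Return the state index of `species`: its position in the allowed list."""
    try:
        return allowed_species.index(species)
    except ValueError:
        raise ValueError(f"{species!r} is not in {allowed_species!r}") from None


def state_to_species(allowed_species, state):
    """Return the species that a state index refers to."""
    return allowed_species[state]


# Hypothetical allowed-species lists, mirroring the examples in the text.
na_site = ["Na", "X"]
framework_site = ["Si", "P", "Al"]

na_vacancy_state = species_to_state(na_site, "X")        # vacancy on the Na site
al_state = species_to_state(framework_site, "Al")        # Al on the Si/P/Al site

# Reordering the list changes which index a species receives.
reordered_vacancy_state = species_to_state(["X", "Na"], "X")
```

With the orders shown, the vacancy maps to state 1 on the Na site, while the reordered list assigns it state 0. This is why a model fitted with one order cannot be reused with another.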
Chebyshev Basis
For a site with q allowed species, kMCpy uses q - 1 non-constant Chebyshev
site functions. A cluster feature is a product of the selected site functions
over the sites in that cluster.
This means multicomponent sites naturally create more decorated features:
| Site States | Non-Constant Site Functions |
|---|---|
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
A pair of two four-state sites can therefore contribute up to 3 x 3 = 9
decorated pair features.
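As a rough illustration of how q - 1 site functions and their products could be evaluated, the sketch below uses NumPy's Chebyshev polynomials. It is not kMCpy's implementation. The mapping of states 0..q-1 to evenly spaced points on [-1, 1], the absence of any normalization, and the names `site_functions` and `cluster_feature` are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def site_functions(q):
    """Return a (q - 1, q) array; row k - 1 holds T_k evaluated at each state.

    Assumption: states 0..q-1 are mapped to evenly spaced points on [-1, 1].
    The constant function T_0 is left out, so q states give q - 1 rows.
    """
    if q < 2:
        raise ValueError("a site needs at least two allowed species")
    x = np.linspace(-1.0, 1.0, q)
    rows = []
    for k in range(1, q):
        coeffs = np.zeros(k + 1)
        coeffs[k] = 1.0  # coefficient vector that selects T_k
        rows.append(cheb.chebval(x, coeffs))
    return np.array(rows)


def cluster_feature(states, state_counts, decoration):
    """Product of the selected site functions over the sites of one cluster.

    states       -- state index of each site in the cluster
    state_counts -- number of allowed species q of each site
    decoration   -- which site function (1..q-1) is selected on each site
    """
    value = 1.0
    for s, q, k in zip(states, state_counts, decoration):
        value *= site_functions(q)[k - 1, s]
    return value
```

Under these assumptions a two-state site has the single function values -1 and 1, so a pair of two-state sites in states (0, 1) gives `cluster_feature((0, 1), (2, 2), (1, 1)) == -1.0`.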
Cost
More species per site increase the number of decorated cluster features. The largest cost usually appears during correlation-matrix construction and fitting, not during the accepted-hop update itself.
Keep the local cutoff and cluster cutoffs physically motivated. A larger basis is useful only if the training data can constrain it.
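To get a feel for how quickly the basis grows, the upper bound on decorations of a single cluster can be counted as the product of (q - 1) over its sites. The sketch below is a back-of-the-envelope estimate with hypothetical helper names. It ignores any reduction from symmetry-equivalent decorations, so it gives an upper bound rather than the number of features kMCpy actually builds.

```python
from itertools import product
from math import prod


def decorations(site_state_counts):
    """List every choice of one non-constant site function per site.

    site_state_counts holds the number of allowed species q for each site
    in the cluster; site functions are labelled 1..q-1.
    """
    return list(product(*(range(1, q) for q in site_state_counts)))


def max_decorated_features(site_state_counts):
    """Upper bound on decorated features for one cluster: product of (q - 1)."""
    return prod(q - 1 for q in site_state_counts)


binary_pair = max_decorated_features([2, 2])          # two two-state sites
quaternary_pair = max_decorated_features([4, 4])      # two four-state sites
quaternary_triplet = max_decorated_features([4, 4, 4])
```

A binary pair has a single decoration, a pair of four-state sites has up to 9, and a triplet of four-state sites has up to 27. Each added site or species multiplies the number of columns the training data has to constrain.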
Practical Rules
- Use basis_type="chebyshev" for multicomponent active sites.
- Keep the same site_mapping species order throughout fitting and kMC.
- Refit if you change the basis, local site order, cutoff, or allowed species.
- Check the correlation matrix shape before fitting.
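The last rule can be turned into a quick sanity check. The sketch below is not part of kMCpy. It assumes the correlation matrix is available as a 2D array with one row per training sample and one column per feature, and `check_correlation_matrix` is a hypothetical helper name.

```python
import numpy as np


def check_correlation_matrix(corr):
    """Summarize whether the training data can constrain every feature.

    Assumption: rows are training samples and columns are cluster features.
    """
    corr = np.asarray(corr, dtype=float)
    if corr.ndim != 2:
        raise ValueError(f"expected a 2D matrix, got shape {corr.shape}")
    n_samples, n_features = corr.shape
    rank = int(np.linalg.matrix_rank(corr))
    return {
        "n_samples": n_samples,
        "n_features": n_features,
        "rank": rank,
        # More columns than rows: the fit cannot be unique.
        "underdetermined": n_features > n_samples,
        # Some columns are linear combinations of others.
        "rank_deficient": rank < n_features,
    }


# Toy matrices standing in for real correlation data.
well_posed = check_correlation_matrix(np.eye(3))
ill_posed = check_correlation_matrix(np.ones((2, 3)))
```

If the check reports an underdetermined or rank-deficient matrix, shrinking the cutoffs or adding training data is usually a better response than fitting anyway.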