Module gmr.sklearn

Classes

class GaussianMixtureRegressor(n_components, priors=None, means=None, covariances=None, verbose=0, random_state=None, R_diff=0.0001, n_iter=500, init_params='random')
class GaussianMixtureRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
    """Gaussian mixture regression compatible with scikit-learn.

    Parameters
    ----------
    n_components : int
        Number of MVNs that compose the GMM.

    priors : array, shape (n_components,), optional
        Weights of the components.

    means : array, shape (n_components, n_features), optional
        Means of the components.

    covariances : array, shape (n_components, n_features, n_features), optional
        Covariances of the components.

    verbose : int, optional (default: 0)
        Verbosity level.

    random_state : int or RandomState, optional (default: global random state)
        If an integer is given, it fixes the seed. Defaults to the global numpy
        random number generator.

    R_diff : float, optional (default: 1e-4)
        Minimum allowed difference of responsibilities between successive
        EM iterations.

    n_iter : int, optional (default: 500)
        Maximum number of iterations.

    init_params : str, optional (default: 'random')
        Parameter initialization strategy. If means and covariances are
        given in the constructor, this parameter will have no effect.
        'random' will sample initial means randomly from the dataset
        and set covariances to identity matrices. This is the
        computationally cheap solution.
        'kmeans++' will use k-means++ initialization for means and
        initialize covariances to diagonal matrices with variances
        set based on the average distances of samples in each dimension.
        This is computationally more expensive but often gives much
        better results.

    Attributes
    ----------
    gmm_ : GMM
        Underlying GMM object

    indices_ : array, shape (n_features,)
        Indices of inputs
    """

    def __init__(self, n_components, priors=None, means=None, covariances=None,
                 verbose=0, random_state=None, R_diff=1e-4, n_iter=500,
                 init_params="random"):
        self.n_components = n_components
        self.priors = priors
        self.means = means
        self.covariances = covariances
        self.verbose = verbose
        self.random_state = random_state
        self.R_diff = R_diff
        self.n_iter = n_iter
        self.init_params = init_params

    def fit(self, X, y):
        self.gmm_ = GMM(
            self.n_components, priors=self.priors, means=self.means,
            covariances=self.covariances, verbose=self.verbose,
            random_state=self.random_state)

        X, y = check_X_y(X, y, estimator=self.gmm_, dtype=FLOAT_DTYPES,
                         multi_output=True)
        if y.ndim == 1:
            y = np.expand_dims(y, 1)

        self.indices_ = np.arange(X.shape[1])

        self.gmm_.from_samples(
            np.hstack((X, y)), R_diff=self.R_diff, n_iter=self.n_iter,
            init_params=self.init_params)
        return self

    def predict(self, X):
        check_is_fitted(self, ["gmm_", "indices_"])
        X = check_array(X, estimator=self.gmm_, dtype=FLOAT_DTYPES)

        return self.gmm_.predict(self.indices_, X)

Gaussian mixture regression compatible with scikit-learn.

Parameters

n_components : int
Number of MVNs that compose the GMM.
priors : array, shape (n_components,), optional
Weights of the components.
means : array, shape (n_components, n_features), optional
Means of the components.
covariances : array, shape (n_components, n_features, n_features), optional
Covariances of the components.
verbose : int, optional (default: 0)
Verbosity level.
random_state : int or RandomState, optional (default: global random state)
If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
R_diff : float, optional (default: 1e-4)
Minimum allowed difference of responsibilities between successive EM iterations.
n_iter : int, optional (default: 500)
Maximum number of iterations.
init_params : str, optional (default: 'random')
Parameter initialization strategy. If means and covariances are given in the constructor, this parameter has no effect. 'random' samples initial means randomly from the dataset and sets covariances to identity matrices; this is the computationally cheap option. 'kmeans++' uses k-means++ initialization for the means and initializes covariances to diagonal matrices with variances based on the average distances of samples in each dimension; this is computationally more expensive but often gives much better results.
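The 'random' strategy described above can be sketched in a few lines of NumPy. This is an illustration of the description only, not the library's code: the helper name random_init is hypothetical, and both the uniform priors and sampling without replacement are assumptions that the documentation does not confirm.

```python
import numpy as np


def random_init(X, n_components, random_state=None):
    """Hypothetical helper: 'random'-style initialization of a GMM.

    Means are drawn from the rows of X, covariances are identity matrices.
    Uniform priors and sampling without replacement are assumptions.
    """
    rng = np.random.default_rng(random_state)
    n_samples, n_features = X.shape

    # Pick n_components distinct samples as the initial means.
    idx = rng.choice(n_samples, size=n_components, replace=False)
    means = X[idx].copy()

    # One identity covariance matrix per component.
    covariances = np.stack([np.eye(n_features)] * n_components)

    # Assumed: every component starts with the same weight.
    priors = np.full(n_components, 1.0 / n_components)
    return priors, means, covariances
```

Calling random_init(X, 3, random_state=0) on an array with 10 rows and 2 columns returns priors of shape (3,), means of shape (3, 2), and covariances of shape (3, 2, 2).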

Attributes

gmm_ : GMM
Underlying GMM object
indices_ : array, shape (n_features,)
Indices of the input features within the joint (X, y) space on which the GMM is fitted.
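To make the role of gmm_ and indices_ more concrete, the following is a minimal NumPy sketch of textbook Gaussian mixture regression: a GMM over the joint (X, y) space is conditioned on the input dimensions, and the per-component conditional means are averaged with weights given by how well each component explains the input. The function name gmr_predict and its signature are hypothetical; the actual GMM.predict implementation may differ in detail.

```python
import numpy as np


def gmr_predict(priors, means, covariances, indices, X):
    """Hypothetical sketch: conditional mean E[y | x] under a joint GMM.

    indices selects the input dimensions of the joint space; all remaining
    dimensions are treated as outputs.
    """
    indices = np.asarray(indices)
    out = np.setdiff1d(np.arange(means.shape[1]), indices)
    n_samples, n_components = X.shape[0], len(priors)

    log_w = np.empty((n_samples, n_components))
    cond_means = np.empty((n_samples, n_components, len(out)))
    for k, (pi, mu, S) in enumerate(zip(priors, means, covariances)):
        S_xx = S[np.ix_(indices, indices)]
        S_yx = S[np.ix_(out, indices)]
        diff = X - mu[indices]
        sol = np.linalg.solve(S_xx, diff.T).T  # S_xx^{-1} (x - mu_x)

        # Conditional mean of component k: mu_y + S_yx S_xx^{-1} (x - mu_x).
        cond_means[:, k] = mu[out] + sol @ S_yx.T

        # Log of prior times marginal density of x under component k.
        maha = np.sum(diff * sol, axis=1)
        _, logdet = np.linalg.slogdet(S_xx)
        log_w[:, k] = np.log(pi) - 0.5 * (
            maha + logdet + len(indices) * np.log(2.0 * np.pi))

    # Normalize the weights in a numerically stable way.
    log_w -= log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum("nk,nkd->nd", w, cond_means)
```

With a single component, this reduces to ordinary Gaussian conditioning; with several components, inputs far from a component's mean receive almost no weight from that component.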

Ancestors

  • sklearn.base.MultiOutputMixin
  • sklearn.base.RegressorMixin
  • sklearn.base.BaseEstimator
  • sklearn.utils._estimator_html_repr._HTMLDocumentationLinkMixin
  • sklearn.utils._metadata_requests._MetadataRequester

Methods

def fit(self, X, y)
def fit(self, X, y):
    self.gmm_ = GMM(
        self.n_components, priors=self.priors, means=self.means,
        covariances=self.covariances, verbose=self.verbose,
        random_state=self.random_state)

    X, y = check_X_y(X, y, estimator=self.gmm_, dtype=FLOAT_DTYPES,
                     multi_output=True)
    if y.ndim == 1:
        y = np.expand_dims(y, 1)

    self.indices_ = np.arange(X.shape[1])

    self.gmm_.from_samples(
        np.hstack((X, y)), R_diff=self.R_diff, n_iter=self.n_iter,
        init_params=self.init_params)
    return self
def predict(self, X)
def predict(self, X):
    check_is_fitted(self, ["gmm_", "indices_"])
    X = check_array(X, estimator=self.gmm_, dtype=FLOAT_DTYPES)

    return self.gmm_.predict(self.indices_, X)
def set_score_request(self: GaussianMixtureRegressor, *, sample_weight: bool | str | None = '$UNCHANGED$') -> GaussianMixtureRegressor
def func(*args, **kw):
    """Updates the request for provided parameters

    This docstring is overwritten below.
    See REQUESTER_DOC for expected functionality
    """
    if not _routing_enabled():
        raise RuntimeError(
            "This method is only available when metadata routing is enabled."
            " You can enable it using"
            " sklearn.set_config(enable_metadata_routing=True)."
        )

    if self.validate_keys and (set(kw) - set(self.keys)):
        raise TypeError(
            f"Unexpected args: {set(kw) - set(self.keys)} in {self.name}. "
            f"Accepted arguments are: {set(self.keys)}"
        )

    # This makes it possible to use the decorated method as an unbound method,
    # for instance when monkeypatching.
    # https://github.com/scikit-learn/scikit-learn/issues/28632
    if instance is None:
        _instance = args[0]
        args = args[1:]
    else:
        _instance = instance

    # Replicating python's behavior when positional args are given other than
    # `self`, and `self` is only allowed if this method is unbound.
    if args:
        raise TypeError(
            f"set_{self.name}_request() takes 0 positional argument but"
            f" {len(args)} were given"
        )

    requests = _instance._get_metadata_request()
    method_metadata_request = getattr(requests, self.name)

    for prop, alias in kw.items():
        if alias is not UNCHANGED:
            method_metadata_request.add_request(param=prop, alias=alias)
    _instance._metadata_request = requests

    return _instance

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). See the metadata routing section of the scikit-learn User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version: 1.3

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. inside a sklearn.pipeline.Pipeline. Otherwise it has no effect.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in score.

Returns

self : object
The updated object.