.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/pcovr/PCovR_Regressors.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_pcovr_PCovR_Regressors.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_pcovr_PCovR_Regressors.py:


Choosing Different Regressors for PCovR
=======================================

.. GENERATED FROM PYTHON SOURCE LINES 10-20

.. code-block:: Python

    import time

    from matplotlib import pyplot as plt
    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import Ridge
    from sklearn.preprocessing import StandardScaler

    from skmatter.decomposition import PCovR

.. GENERATED FROM PYTHON SOURCE LINES 21-23

For this example, we will use the :func:`sklearn.datasets.load_diabetes`
dataset from ``sklearn``.

.. GENERATED FROM PYTHON SOURCE LINES 24-36

.. code-block:: Python

    mixing = 0.5

    X, y = load_diabetes(return_X_y=True)

    X_scaler = StandardScaler()
    X_scaled = X_scaler.fit_transform(X)

    y_scaler = StandardScaler()
    y_scaled = y_scaler.fit_transform(y.reshape(-1, 1))

.. GENERATED FROM PYTHON SOURCE LINES 37-42

Use the default regressor in PCovR
----------------------------------

When no regressor is supplied, PCovR uses
``sklearn.linear_model.Ridge(alpha=1e-6, fit_intercept=False, tol=1e-12)``.

.. GENERATED FROM PYTHON SOURCE LINES 43-53

.. code-block:: Python

    pcovr1 = PCovR(mixing=mixing, n_components=2)

    t0 = time.perf_counter()
    pcovr1.fit(X_scaled, y_scaled)
    t1 = time.perf_counter()

    print(f"Regressor is {pcovr1.regressor_} and fit took {1e3 * (t1 - t0):0.2} ms.")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Regressor is Ridge(alpha=1e-06, fit_intercept=False, tol=1e-12) and fit took 2.3 ms.
.. GENERATED FROM PYTHON SOURCE LINES 54-62

Use a fitted regressor
----------------------

You can pass a fitted regressor to ``PCovR`` to rely on its predetermined
regression parameters. Currently, scikit-matter supports the ``scikit-learn``
classes :class:`LinearModel`, :class:`Ridge`, and :class:`RidgeCV`, with plans
to support any regressor with a similar architecture in the future.

.. GENERATED FROM PYTHON SOURCE LINES 63-73

.. code-block:: Python

    regressor = Ridge(alpha=1e-6, fit_intercept=False, tol=1e-12)

    t0 = time.perf_counter()
    regressor.fit(X_scaled, y_scaled)
    t1 = time.perf_counter()

    print(f"Fit took {1e3 * (t1 - t0):0.2} ms.")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Fit took 0.61 ms.

.. GENERATED FROM PYTHON SOURCE LINES 75-84

.. code-block:: Python

    pcovr2 = PCovR(mixing=mixing, n_components=2, regressor=regressor)

    t0 = time.perf_counter()
    pcovr2.fit(X_scaled, y_scaled)
    t1 = time.perf_counter()

    print(f"Regressor is {pcovr2.regressor_} and fit took {1e3 * (t1 - t0):0.2} ms.")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Regressor is Ridge(alpha=1e-06, fit_intercept=False, tol=1e-12) and fit took 0.99 ms.

.. GENERATED FROM PYTHON SOURCE LINES 85-92

Use a pre-predicted y
---------------------

With ``regressor='precomputed'``, you can pass a regression output
:math:`\hat{Y}` and optional regression weights :math:`W` to PCovR. If
``W=None``, PCovR will determine :math:`W` as the least-squares solution
between :math:`X` and :math:`\hat{Y}`.

.. GENERATED FROM PYTHON SOURCE LINES 93-104

.. code-block:: Python

    regressor = Ridge(alpha=1e-6, fit_intercept=False, tol=1e-12)

    t0 = time.perf_counter()
    regressor.fit(X_scaled, y_scaled)
    t1 = time.perf_counter()

    print(f"Fit took {1e3 * (t1 - t0):0.2} ms.")

    W = regressor.coef_

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Fit took 0.55 ms.
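The ``W=None`` fallback described above can be sketched directly with NumPy:
because :math:`\hat{Y} = X W` lies in the column space of :math:`X`, the
least-squares solution between :math:`X` and :math:`\hat{Y}` recovers the
ridge weights almost exactly. Here ``np.linalg.lstsq`` stands in for PCovR's
internal computation, which may differ in implementation detail.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)
y_scaled = StandardScaler().fit_transform(y.reshape(-1, 1))

regressor = Ridge(alpha=1e-6, fit_intercept=False, tol=1e-12)
regressor.fit(X_scaled, y_scaled)
Y_hat = regressor.predict(X_scaled)

# Least-squares weights between X and Y_hat, analogous to what PCovR
# computes when W is not supplied alongside regressor='precomputed'.
W_lstsq, *_ = np.linalg.lstsq(X_scaled, Y_hat, rcond=None)

# Since Y_hat = X @ w_ridge, the least-squares solution reproduces the
# ridge coefficients up to numerical precision.
print(np.allclose(W_lstsq.ravel(), regressor.coef_.ravel(), atol=1e-4))
```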
.. GENERATED FROM PYTHON SOURCE LINES 106-115

.. code-block:: Python

    pcovr3 = PCovR(mixing=mixing, n_components=2, regressor="precomputed")

    t0 = time.perf_counter()
    pcovr3.fit(X_scaled, y_scaled, W=W)
    t1 = time.perf_counter()

    print(f"Fit took {1e3 * (t1 - t0):0.2} ms.")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Fit took 0.64 ms.

.. GENERATED FROM PYTHON SOURCE LINES 116-121

Comparing Results
-----------------

Because we used the same regressor in all three models, they will yield the
same result.

.. GENERATED FROM PYTHON SOURCE LINES 122-140

.. code-block:: Python

    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4), sharex=True, sharey=True)

    ax1.scatter(*pcovr1.transform(X_scaled).T, c=y)
    ax2.scatter(*pcovr2.transform(X_scaled).T, c=y)
    ax3.scatter(*pcovr3.transform(X_scaled).T, c=y)

    ax1.set_ylabel("PCov$_2$")
    ax1.set_xlabel("PCov$_1$")
    ax2.set_xlabel("PCov$_1$")
    ax3.set_xlabel("PCov$_1$")

    ax1.set_title("Default Regressor")
    ax2.set_title("Pre-fit Regressor")
    ax3.set_title("Precomputed Regression Result")

    fig.show()

.. image-sg:: /examples/pcovr/images/sphx_glr_PCovR_Regressors_001.png
   :alt: Default Regressor, Pre-fit Regressor, Precomputed Regression Result
   :srcset: /examples/pcovr/images/sphx_glr_PCovR_Regressors_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 141-143

As you can imagine, these three options have different use cases -- if you are
working with a large dataset, you should always pre-fit to save on time!

.. _sphx_glr_download_examples_pcovr_PCovR_Regressors.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: PCovR_Regressors.ipynb <PCovR_Regressors.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: PCovR_Regressors.py <PCovR_Regressors.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: PCovR_Regressors.zip <PCovR_Regressors.zip>`

.. only:: html
  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_