sklearn_api.lsimodel – Scikit-learn wrapper for Latent Semantic Indexing
Scikit-learn interface for gensim, for easy use of gensim with scikit-learn. Follows scikit-learn API conventions.
gensim.sklearn_api.lsimodel.LsiTransformer(num_topics=200, id2word=None, chunksize=20000, decay=1.0, onepass=True, power_iters=2, extra_samples=100)
Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator
Base LSI module
Scikit-learn wrapper for the LSI model. See gensim.models.LsiModel for parameter details.
fit(X, y=None)
Fit the model according to the given training data. Calls gensim.models.LsiModel.
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
| Parameters: | X – Training data. y – Target values (optional). fit_params – Optional fit parameters. |
|---|---|
| Returns: | X_new – Transformed array. |
| Return type: | numpy array of shape [n_samples, n_features_new] |
get_params(deep=True)
Get parameters for this estimator.
| Parameters: | deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators. |
|---|---|
| Returns: | params – Parameter names mapped to their values. |
| Return type: | mapping of string to any |
partial_fit(X)
Train model over X.
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as pipelines). The latter have parameters of the form
<component>__<parameter> so that it’s possible to update each
component of a nested object.
| Returns: | The estimator instance itself. |
|---|---|
| Return type: | self |
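The `<component>__<parameter>` naming described above can be illustrated with a small pure-Python sketch. The helper `split_nested_params` and the step name `lsi` are hypothetical and are not part of gensim or scikit-learn; the sketch only shows how a double underscore separates a component name from the parameter it owns.

```python
def split_nested_params(params):
    """Separate top-level parameters from '<component>__<parameter>' ones.

    Hypothetical helper for illustration only; not a gensim or scikit-learn API.
    """
    own = {}
    nested = {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper nesting
        # (e.g. 'a__b__c') stays attached to the sub-key.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


# 'lsi' stands for an assumed pipeline step name holding an LsiTransformer.
own, nested = split_nested_params(
    {"num_topics": 50, "lsi__num_topics": 100, "lsi__decay": 0.5}
)
print(own)     # {'num_topics': 50}
print(nested)  # {'lsi': {'num_topics': 100, 'decay': 0.5}}
```

In a pipeline, a key such as `lsi__num_topics` would therefore be routed to the `num_topics` parameter of the step named `lsi`, while a plain `num_topics` applies to the estimator on which `set_params` is called.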
transform(docs)
Takes a list of documents as input (‘docs’) and returns a matrix of topic weights for the given BOW documents, where a_ij is the weight of topic j in document i. The input docs should be in BOW format and can be a list of documents, such as [ [(4, 1), (7, 1)], [(9, 1), (13, 1)], [(2, 1), (6, 1)] ], or a single document, such as [(4, 1), (7, 1)].
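The two accepted input shapes can be told apart by looking at the first element: a single BOW document starts with a `(token_id, count)` tuple, while a list of documents starts with another list. The following is a minimal pure-Python sketch of that idea, plus the expansion of sparse `(topic_id, weight)` pairs into a fixed-length row. The helpers `as_document_list` and `to_dense_row` are hypothetical names introduced for illustration, and the sketch assumes this is a reasonable way to normalize the input; it is not presented as the wrapper's actual implementation.

```python
def as_document_list(docs):
    """Wrap a single BOW document in a list; copy a list of documents as-is.

    Hypothetical helper for illustration only.
    """
    # A single document is a list of (token_id, count) tuples, so its first
    # element is a tuple. A list of documents has a list as its first element.
    if docs and isinstance(docs[0], tuple):
        return [docs]
    return list(docs)


def to_dense_row(sparse_vector, num_topics):
    """Expand sparse (topic_id, weight) pairs into a row of length num_topics."""
    row = [0.0] * num_topics
    for topic_id, weight in sparse_vector:
        row[topic_id] = float(weight)
    return row


single = [(4, 1), (7, 1)]
batch = [[(4, 1), (7, 1)], [(9, 1), (13, 1)], [(2, 1), (6, 1)]]

print(len(as_document_list(single)))  # 1
print(len(as_document_list(batch)))   # 3
print(to_dense_row([(0, 0.5), (2, -0.25)], num_topics=3))  # [0.5, 0.0, -0.25]
```

Note that an empty list is ambiguous under this check (an empty document or an empty batch); the sketch treats it as an empty batch.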