zoo.models.recommendation package

Submodules

zoo.models.recommendation.neuralcf module

class zoo.models.recommendation.neuralcf.NeuralCF(user_count, item_count, class_num, user_embed=20, item_embed=20, hidden_layers=[40, 20, 10], include_mf=True, mf_embed=20, bigdl_type='float')[source]

Bases: zoo.models.recommendation.recommender.Recommender

The neural collaborative filtering model used for recommendation.

# Arguments
user_count: The number of users. Positive int.
item_count: The number of items. Positive int.
class_num: The number of classes. Positive int.
user_embed: Units of user embedding. Positive int. Default is 20.
item_embed: Units of item embedding. Positive int. Default is 20.
hidden_layers: Units of hidden layers for MLP. Tuple of positive int. Default is (40, 20, 10).
include_mf: Whether to include Matrix Factorization. Boolean. Default is True.
mf_embed: Units of matrix factorization embedding. Positive int. Default is 20.

build_model()[source]
static load_model(path, weight_path=None, bigdl_type='float')[source]

Load an existing NeuralCF model (with weights).

# Arguments
path: The path for the pre-defined model. Local file system, HDFS and Amazon S3 are supported. HDFS path should be like ‘hdfs://[host]:[port]/xxx’. Amazon S3 path should be like ‘s3a://bucket/xxx’.
weight_path: The path for pre-trained weights if any. Default is None.
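The three path styles accepted by load_model can be told apart by their URI scheme. The following is a minimal sketch using only the Python standard library; `storage_kind` is a hypothetical helper written for illustration and is not part of zoo, and the host, bucket and file names are made up.

```python
from urllib.parse import urlparse


def storage_kind(path):
    """Classify a model path by URI scheme (hypothetical helper, not a zoo API)."""
    scheme = urlparse(path).scheme
    if scheme == "hdfs":
        return "hdfs"
    if scheme == "s3a":
        return "s3"
    # Assumption: anything without an hdfs/s3a scheme is a local file system path.
    return "local"


print(storage_kind("hdfs://namenode:9000/models/ncf.model"))
print(storage_kind("s3a://bucket/models/ncf.model"))
print(storage_kind("/tmp/ncf.model"))
```

A check like this can be useful before calling load_model, for example to fail early when a path uses ‘s3://’ instead of the documented ‘s3a://’ form.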

zoo.models.recommendation.recommender module

class zoo.models.recommendation.recommender.Recommender(jvalue, bigdl_type, *args)[source]

Bases: zoo.models.common.zoo_model.KerasZooModel

The base class for recommendation models in Analytics Zoo.

predict_user_item_pair(feature_rdd)[source]

Predict for user-item pairs.

# Arguments
feature_rdd: RDD of UserItemFeature.
:return RDD of UserItemPrediction.

recommend_for_item(feature_rdd, max_users)[source]

Recommend a number of users for each item.

# Arguments
feature_rdd: RDD of UserItemFeature.
max_users: The number of users to be recommended to each item. Positive int.
:return RDD of UserItemPrediction.

recommend_for_user(feature_rdd, max_items)[source]

Recommend a number of items for each user.

# Arguments
feature_rdd: RDD of UserItemFeature.
max_items: The number of items to be recommended to each user. Positive int.
:return RDD of UserItemPrediction.

class zoo.models.recommendation.recommender.UserItemFeature(user_id, item_id, sample, bigdl_type='float')[source]

Bases: object

Represent records of user-item with features.

Each record should contain the following fields:
user_id: Positive int.
item_id: Positive int.
sample: Sample which consists of feature(s) and label(s).

class zoo.models.recommendation.recommender.UserItemPrediction(user_id, item_id, prediction, probability, bigdl_type='float')[source]

Bases: object

Represent the prediction results of user-item pairs.

Each prediction record will contain the following information:
user_id: Positive int.
item_id: Positive int.
prediction: The prediction (rating) for the user on the item.
probability: The probability for the prediction.
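To make the shape of these records concrete, the sketch below groups prediction records by user and keeps the top few for each, which is the kind of result recommend_for_user is described as returning. It runs on plain Python lists rather than RDDs, `Prediction` is a stand-in namedtuple rather than the zoo class, and ranking by probability is an assumption; the library may order results differently.

```python
from collections import defaultdict, namedtuple

# Stand-in for the prediction record described above (not the zoo class itself).
Prediction = namedtuple("Prediction", ["user_id", "item_id", "prediction", "probability"])


def top_items_per_user(predictions, max_items):
    """Keep the max_items highest-probability records for each user."""
    by_user = defaultdict(list)
    for p in predictions:
        by_user[p.user_id].append(p)
    # Assumption: records are ranked by probability, highest first.
    return {
        user: sorted(records, key=lambda r: r.probability, reverse=True)[:max_items]
        for user, records in by_user.items()
    }


preds = [
    Prediction(1, 10, 5, 0.9),
    Prediction(1, 11, 4, 0.6),
    Prediction(1, 12, 5, 0.7),
    Prediction(2, 10, 3, 0.8),
]
top = top_items_per_user(preds, max_items=2)
print([p.item_id for p in top[1]])  # [10, 12]
```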

zoo.models.recommendation.session_recommender module

class zoo.models.recommendation.session_recommender.SessionRecommender(item_count, item_embed, rnn_hidden_layers=[40, 20], session_length=0, include_history=False, mlp_hidden_layers=[40, 20], history_length=0, bigdl_type='float')[source]

Bases: zoo.models.recommendation.recommender.Recommender

The Session Recommender model used for recommendation.

# Arguments
item_count: The number of distinct items. Positive integer.
item_embed: The output size of embedding layer. Positive integer.
rnn_hidden_layers: Units of hidden layers for the RNN model. Array of positive integers.
session_length: The max number of items in the sequence of a session.
include_history: Whether to include purchase history. Boolean. Default is False.
mlp_hidden_layers: Units of hidden layers for the MLP model. Array of positive integers.
history_length: The max number of items in the sequence of historical purchase.
build_model()[source]
static load_model(path, weight_path=None, bigdl_type='float')[source]

Load an existing SessionRecommender model (with weights).

# Arguments
path: The path for the pre-defined model. Local file system, HDFS and Amazon S3 are supported. HDFS path should be like ‘hdfs://[host]:[port]/xxx’. Amazon S3 path should be like ‘s3a://bucket/xxx’.
weight_path: The path for pre-trained weights if any. Default is None.

predict_user_item_pair(feature_rdd)[source]

Predict for user-item pairs.

# Arguments
feature_rdd: RDD of UserItemFeature.
:return RDD of UserItemPrediction.

recommend_for_item(feature_rdd, max_users)[source]

Recommend a number of users for each item.

# Arguments
feature_rdd: RDD of UserItemFeature.
max_users: The number of users to be recommended to each item. Positive int.
:return RDD of UserItemPrediction.

recommend_for_session(sessions, max_items, zero_based_label)[source]

Recommend for sessions given an RDD of samples or a list of samples.

# Arguments
sessions: RDD of samples or list of samples.
max_items: Number of items to be recommended to each user. Positive integer.
zero_based_label: True if data starts from 0, False if data starts from 1.
:return RDD of list of list(item, probability).
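The effect of zero_based_label can be illustrated on a single session's output. The sketch below assumes the model yields one probability per item position and that the flag only decides whether position 0 maps to item 0 or item 1; `top_items` is a hypothetical helper, not the library's implementation.

```python
def top_items(probabilities, max_items, zero_based_label):
    """Return (item, probability) pairs for the max_items most probable items."""
    # Assumption: item id = position in the output vector, plus 1 for one-based data.
    offset = 0 if zero_based_label else 1
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(index + offset, prob) for index, prob in ranked[:max_items]]


session_output = [0.1, 0.6, 0.3]
print(top_items(session_output, 2, zero_based_label=True))   # [(1, 0.6), (2, 0.3)]
print(top_items(session_output, 2, zero_based_label=False))  # [(2, 0.6), (3, 0.3)]
```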

recommend_for_user(feature_rdd, max_items)[source]

Recommend a number of items for each user.

# Arguments
feature_rdd: RDD of UserItemFeature.
max_items: The number of items to be recommended to each user. Positive int.
:return RDD of UserItemPrediction.

zoo.models.recommendation.txt module

zoo.models.recommendation.utils module

zoo.models.recommendation.utils.categorical_from_vocab_list(sth, vocab_list, default=-1, start=0)[source]
zoo.models.recommendation.utils.get_boundaries(target, boundaries, default=-1, start=0)[source]
zoo.models.recommendation.utils.get_deep_tensors(row, column_info)[source]

Convert a row to tensors given column feature information of a WideAndDeep model.

Parameters:
  • row – Row of userId, itemId, features and label
  • column_info – ColumnFeatureInfo specifying the information of different features
Returns:

An array of tensors as input for the deep part of a WideAndDeep model

zoo.models.recommendation.utils.get_negative_samples(indexed)[source]
zoo.models.recommendation.utils.get_wide_tensor(row, column_info)[source]

Prepare the tensor for the wide part of a WideAndDeep model based on SparseDense.

Parameters:
  • row – Row of userId, itemId, features and label
  • column_info – ColumnFeatureInfo specifying the information of different features
Returns:

An array of tensors as input for the wide part of a WideAndDeep model

zoo.models.recommendation.utils.hash_bucket(content, bucket_size=1000, start=0)[source]
zoo.models.recommendation.utils.row_to_sample(row, column_info, model_type='wide_n_deep')[source]

Convert a row to a sample given column feature information of a WideAndDeep model.

Parameters:
  • row – Row of userId, itemId, features and label
  • column_info – ColumnFeatureInfo specifying the information of different features
Returns:

TensorSample as input for a WideAndDeep model

zoo.models.recommendation.utils.to_user_item_feature(row, column_info, model_type='wide_n_deep')[source]

Convert a row to a UserItemFeature given column feature information of a WideAndDeep model.

Parameters:
  • row – Row of userId, itemId, features and label
  • column_info – ColumnFeatureInfo specifying the information of different features
Returns:

UserItemFeature for a recommender model
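The helpers categorical_from_vocab_list, get_boundaries and hash_bucket are listed in this module without descriptions. The sketch below shows what helpers with these signatures plausibly do, inferred only from the parameter names: map a value to a vocabulary index, map a number to a bucket between boundaries, and hash arbitrary content into a fixed number of buckets. The function names and behaviour here are assumptions, not the library's code. In particular, crc32 and the use of None as the missing-value marker are choices made for this illustration.

```python
import zlib


def vocab_index(value, vocab_list, default=-1, start=0):
    """Index of value in vocab_list, shifted by start; default + start if absent."""
    if value in vocab_list:
        return vocab_list.index(value) + start
    return default + start


def bucket_index(target, boundaries, default=-1, start=0):
    """Index of the first boundary greater than target, shifted by start."""
    if target is None:  # assumption: None marks a missing value
        return default + start
    for i, bound in enumerate(boundaries):
        if target < bound:
            return i + start
    return len(boundaries) + start


def stable_hash_bucket(content, bucket_size=1000, start=0):
    """Map content to a bucket in [start, start + bucket_size)."""
    # crc32 keeps results reproducible across runs; the library may use another hash.
    return zlib.crc32(str(content).encode("utf-8")) % bucket_size + start


print(vocab_index("F", ["M", "F"], start=1))   # 2
print(vocab_index("?", ["M", "F"], start=1))   # 0 (unknown value)
print(bucket_index(27, [18, 25, 35]))          # 2
print(bucket_index(40, [18, 25, 35]))          # 3
```

With start=1, index 0 is left free for unknown values, which is one reason such helpers commonly expose both `default` and `start`.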

zoo.models.recommendation.wide_and_deep module

class zoo.models.recommendation.wide_and_deep.ColumnFeatureInfo(wide_base_cols=None, wide_base_dims=None, wide_cross_cols=None, wide_cross_dims=None, indicator_cols=None, indicator_dims=None, embed_cols=None, embed_in_dims=None, embed_out_dims=None, continuous_cols=None, label='label', bigdl_type='float')[source]

Bases: object

The same data information shared by the WideAndDeep model and its feature generation part.

Each instance could contain the following fields:
wide_base_cols: Data of wide_base_cols together with wide_cross_cols will be fed into the wide model. List of String. Default is an empty list.
wide_base_dims: Dimensions of wide_base_cols. The dimensions of the data in wide_base_cols should be within the range of wide_base_dims. List of int. Default is an empty list.
wide_cross_cols: Data of wide_cross_cols will be fed into the wide model. List of String. Default is an empty list.
wide_cross_dims: Dimensions of wide_cross_cols. The dimensions of the data in wide_cross_cols should be within the range of wide_cross_dims. List of int. Default is an empty list.
indicator_cols: Data of indicator_cols will be fed into the deep model as multi-hot vectors. List of String. Default is an empty list.
indicator_dims: Dimensions of indicator_cols. The dimensions of the data in indicator_cols should be within the range of indicator_dims. List of int. Default is an empty list.
embed_cols: Data of embed_cols will be fed into the deep model as embeddings. List of String. Default is an empty list.
embed_in_dims: Input dimension of the data in embed_cols. The dimensions of the data in embed_cols should be within the range of embed_in_dims. List of int. Default is an empty list.
embed_out_dims: The dimensions of embeddings. List of int. Default is an empty list.
continuous_cols: Data of continuous_cols will be treated as continuous values for the deep model. List of String. Default is an empty list.
label: The name of the ‘label’ column. String. Default is ‘label’.
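Each `*_cols` list above is paired with a `*_dims` list, and the data in a column is expected to stay within the matching dimension. A minimal sketch of that constraint follows, assuming "within the range" means 0 <= value < dim and using plain Python dicts in place of Spark rows. `check_columns` and the column names are hypothetical, and the library may use a different index base.

```python
def check_columns(cols, dims, row):
    """Check that cols and dims pair up and each row value lies in [0, dim)."""
    if len(cols) != len(dims):
        raise ValueError("expected one dimension per column")
    for col, dim in zip(cols, dims):
        value = row[col]
        # Assumption: valid values are 0 .. dim - 1.
        if not 0 <= value < dim:
            raise ValueError(f"{col}={value} is outside [0, {dim})")
    return True


row = {"gender": 1, "age_bucket": 3}
print(check_columns(["gender", "age_bucket"], [3, 8], row))  # True
```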

class zoo.models.recommendation.wide_and_deep.WideAndDeep(class_num, column_info, model_type='wide_n_deep', hidden_layers=[40, 20, 10], bigdl_type='float')[source]

Bases: zoo.models.recommendation.recommender.Recommender

The Wide and Deep model used for recommendation.

# Arguments
class_num: The number of classes. Positive int.
column_info: An instance of ColumnFeatureInfo.
model_type: String. ‘wide’, ‘deep’ and ‘wide_n_deep’ are supported. Default is ‘wide_n_deep’.
hidden_layers: Units of hidden layers for the deep model. Tuple of positive int. Default is (40, 20, 10).
build_model()[source]
static load_model(path, weight_path=None, bigdl_type='float')[source]

Load an existing WideAndDeep model (with weights).

# Arguments
path: The path for the pre-defined model. Local file system, HDFS and Amazon S3 are supported. HDFS path should be like ‘hdfs://[host]:[port]/xxx’. Amazon S3 path should be like ‘s3a://bucket/xxx’.
weight_path: The path for pre-trained weights if any. Default is None.

Module contents