Working with Time-Series Data
A unique feature of this toolbox is that it has also been designed to work with time-series data.
For example, model instances can be downsampled across items, where items refer to time samples. You must specify the sampling_freq of the data and the target, where the target must be expressed in one of the units ['hz','samples','seconds']. Downsampling is performed by averaging over bin windows. In this example we downsample a dataset from 10 Hz to 5 Hz.
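Independently of the toolbox's own method, the bin-averaging idea can be sketched with pandas. This is a minimal illustration, assuming a DataFrame with users as rows and time samples as columns. The helper name `downsample_by_mean` and its parameters are hypothetical and are not part of the toolbox.

```python
import numpy as np
import pandas as pd


def downsample_by_mean(data, sampling_freq, target_hz):
    """Average consecutive columns (time samples) in bins to reach target_hz.

    Hypothetical helper for illustration; not the toolbox's implementation.
    """
    if target_hz <= 0 or target_hz > sampling_freq:
        raise ValueError("target_hz must be in (0, sampling_freq]")
    # Number of original samples that fall into each output bin.
    bin_width = sampling_freq / target_hz
    # Assign each column position to a bin index.
    bins = np.floor(np.arange(data.shape[1]) / bin_width).astype(int)
    # Group the time samples by bin and average them; NaNs are skipped.
    return data.T.groupby(bins).mean().T


# Two users rated at 10 Hz for 0.6 s (6 samples).
ratings_10hz = pd.DataFrame(
    [[10, 20, 30, 40, 50, 60],
     [0, 2, 4, 6, 8, 10]],
    index=["user_a", "user_b"],
    dtype=float,
)

# 10 Hz -> 5 Hz: every two neighboring samples are averaged into one.
ratings_5hz = downsample_by_mean(ratings_10hz, sampling_freq=10, target_hz=5)
print(ratings_5hz)
```

Because missing values are skipped when averaging, a bin that contains at least one rating still yields a value, while a bin with no ratings stays empty.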
It is also possible to leverage presumed autocorrelation when training models by using the dilate_by_nsamples=n_samples keyword. This flag will convolve a boxcar kernel of width n_samples with each user's ratings from model.train_mask. The dilation is centered on each sample. The intuition is that if a subject rates an item at a given time point, say '50', they likely would have rated the time points immediately preceding and following it similarly (e.g., [50,50,50]), because the data are autocorrelated. The stronger the presumed autocorrelation, the larger the value of n_samples that is likely to help. This allows sparsely sampled time series to be estimated more accurately.
cf = NNMF_sgd(ratings, n_mask_items=.5)
# Dilate each observed training rating with a boxcar kernel 20 samples wide
cf.fit(n_iterations=100,
       user_fact_reg=1.0,
       item_fact_reg=0.001,
       user_bias_reg=0,
       item_bias_reg=0,
       learning_rate=.001,
       dilate_by_nsamples=20)
cf.summary()
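To make the dilation step concrete, here is a rough NumPy sketch of convolving a centered boxcar kernel with each user's observed ratings. It is not the toolbox's internal code. The function `dilate_ratings` is hypothetical, and averaging ratings where windows overlap is an assumption of this sketch.

```python
import numpy as np


def dilate_ratings(ratings, train_mask, n_samples):
    """Spread each observed rating over a centered window of n_samples.

    ratings: 2D float array (users x time samples).
    train_mask: boolean array of the same shape, True where a rating is observed.
    Overlapping windows are averaged (an assumption of this sketch).
    """
    if n_samples < 1 or n_samples > ratings.shape[1]:
        raise ValueError("n_samples must be between 1 and the number of time samples")
    kernel = np.ones(n_samples)  # boxcar kernel
    dilated = np.full(ratings.shape, np.nan)
    for i, (row, observed) in enumerate(zip(ratings, train_mask)):
        # Sum of the observed ratings inside the window around each time point.
        total = np.convolve(np.where(observed, row, 0.0), kernel, mode="same")
        # Number of observed ratings inside that same window.
        count = np.convolve(observed.astype(float), kernel, mode="same")
        covered = count > 0.5
        dilated[i, covered] = total[covered] / count[covered]
    return dilated


# One user, seven time points, only two of them rated.
ratings = np.array([[np.nan, np.nan, 50.0, np.nan, np.nan, np.nan, 20.0]])
train_mask = ~np.isnan(ratings)

dilated = dilate_ratings(ratings, train_mask, n_samples=3)
print(dilated)
```

With n_samples=3, the single rating of 50 is copied to its two neighbors, matching the [50,50,50] intuition above. An odd n_samples keeps the window symmetric around each rated sample.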