

Hessian could not be derived. Setting all entries to NA. #132

Closed
cvermehren opened this issue Jul 29, 2020 · 8 comments
Labels
FAQ Issue that should be listed in the FAQ use case General application question related to an user's use case.

Comments

@cvermehren

cvermehren commented Jul 29, 2020

Thank you for this wonderful new package, which includes the much-needed possibility of accounting for covariates.

I have tried the pnbd() function, but it results in NA predictions. The warning messages are:

Starting estimation...
Estimation finished!
Warning: Estimation failed with NA coefs. The returned object contains results but further usage is restricted.
Warning: Hessian could not be derived. Setting all entries to NA.

I have used a real transaction dataset. From this, I filtered the data to a cohort defined as all customers starting during the first two months of a year.

I have compared with BTYDPlus using the same dataset and experienced no problems.

Code:

cht.clv <- cht[, list(
  Id = cust,
  Date = date,
  Price = sales
  )]

cht.clv <- clvdata(
  cht.clv,
  date.format = "ymd",
  time.unit = "w",
  estimation.split = 145
)

summary(cht.clv)
 CLV Transaction Data 

 Time unit         Weeks         
 Estimation length 145.0000 Weeks
 Holdout length    145.5714 Weeks
 
 Transaction Data Summary 
 Estimation       Holdout          Total       
 Number of customers                -                -                3427        
 First Transaction in period        2015-01-01       2017-10-13       2015-01-01  
 Last Transaction in period         2017-10-11       2020-07-28       2020-07-28  
 Total # Transactions               7157             2546             9703        
 Mean # Transactions per cust       2.088            3.017            2.831       
 (SD)                               2.269            3.440            3.795       
 Mean Spending per Transaction      1010.948         1273.318         1079.792    
 (SD)                               747.756          954.229          815.222     
 Total Spending                     7235357.279      3241867.913      10477225.191
 Total # zero repeaters             2048             2583             1789        
 Percentage # zero repeaters        0.598            0.754            0.522       
 Mean Interpurchase time            35.256           28.087           60.268      
 (SD) 

est <- pnbd(cht.clv)
predict(est)
 Predicting from 2017-10-13 until (incl.) 2020-07-28 (145.71 Weeks).
    Id      period.first period.last period.length actual.x actual.spending PAlive CET DERT predicted.Spending predicted.CLV
 1: ***     2017-10-13  2020-07-28      145.7143        1           359.2     NA  NA   NA           594.9644            NA
 2: ***     2017-10-13  2020-07-28      145.7143        0             0.0    NaN  NA  NaN          1022.0748           NaN
 3: ***     2017-10-13  2020-07-28      145.7143        0             0.0     NA  NA   NA          1468.8352            NA
 4: ***     2017-10-13  2020-07-28      145.7143        1          1829.6    NaN  NA  NaN          1022.0748           NaN
 5: ***     2017-10-13  2020-07-28      145.7143        0             0.0     NA  NA   NA          1070.2460            NA
-- ---                                                                                                                           
 3423: ***  2017-10-13  2020-07-28      145.7143        0             0.0     NA  NA   NA          1022.0748            NA
 3424: ***  2017-10-13  2020-07-28      145.7143        0             0.0    NaN  NA  NaN          1022.0748           NaN
 3425: ***  2017-10-13  2020-07-28      145.7143        0             0.0     NA  NA   NA          1022.0748            NA
 3426: ***  2017-10-13  2020-07-28      145.7143        0             0.0    NaN  NA  NaN          1022.0748           NaN
 3427: ***  2017-10-13  2020-07-28      145.7143        2          2238.4     NA  NA   NA           897.1967            NA
@pschil
Collaborator

pschil commented Jul 29, 2020

Hi

thanks for your feedback.

This issue is likely related to the optimization method used to maximize the likelihood function.
By default, CLVTools currently uses the L-BFGS-B method, which immediately stops and returns NA coefficients if non-finite values are returned from the likelihood function. Unfortunately, the log-likelihood function for pnbd requires the notorious hypergeometric 2F1 function, which, depending on the data and parameters, will yield NaNs or Infs.

Hence, for a more stable optimization you likely want to use the Nelder-Mead method instead, which does not terminate when non-finite values are returned:

est <- pnbd(clv.data = cht.clv, optimx.args = list(method="Nelder-Mead"))

# or for more insights about the optimization
est <- pnbd(clv.data = cht.clv, optimx.args = list(method="Nelder-Mead", control=list(trace=6)))

See ?optimx for what other options are available and see also other examples in ?pnbd.

See also this previous issue about NA coefficients, where some more details are provided: #112.

The error therefore is not really about the Hessian that could not be derived, but about the estimation failing in general. In the next release, it will no longer be possible to predict() and plot() if there are NAs in the coefficients, because this is obviously rather confusing.
Feel free to already work with the latest version on the development branch, as it also contains other (exciting!) new features, such as fitting the Gamma-Gamma (gg()) spending model separately:

devtools::install_github(repo = "bachmannpatrick/CLVTools", ref = "development")

Note that it requires the GSL library in order to compile successfully from source.
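If compilation from source fails because GSL is missing, the development headers can usually be installed through the system package manager. A sketch; package names below are the common ones and may differ on your system:

```shell
# Debian/Ubuntu
sudo apt-get install libgsl-dev

# Fedora
sudo dnf install gsl-devel

# macOS (Homebrew)
brew install gsl
```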

Further, I would generally recommend also investigating the model fit with summary(est) and plot(est) before proceeding to predict.

Another minor thing I just saw in your code:

cht.clv <- cht[, list(Id = cust, Date = date, Price = sales)]
cht.clv <- clvdata(cht.clv, date.format = "ymd", time.unit = "w", estimation.split = 145)

can be done with

cht.clv <- clvdata(cht.clv, date.format = "ymd", time.unit = "w", estimation.split = 145, name.id = "cust", name.date="date", name.price = "sales")

Patrik

@cvermehren
Author

cvermehren commented Jul 29, 2020 via email

@cvermehren
Author

cvermehren commented Jul 29, 2020

Hi Patrik,

Here is some more feedback. I first ran the following code:

est <- pnbd(clv.data = cht.clv, optimx.args = list(method="Nelder-Mead"))
summary(est)

Result:

Pareto NBD Standard  Model 

Call:
pnbd(clv.data = cht.clv, optimx.args = list(method = "Nelder-Mead"))

Fitting period:                                
Estimation start  2015-01-01    
Estimation end    2017-10-12    
Estimation length 145.0000 Weeks

Coefficients:
       Estimate Std. Error z-val Pr(>|z|)
r        0.5329         NA    NA       NA
alpha   38.0598         NA    NA       NA
s       64.2792         NA    NA       NA
beta  7721.5264         NA    NA       NA

Optimization info:                  
LL     -16176.2972
AIC    32360.5945 
BIC    32384.1955 
KKT 1  NA         
KKT 2  NA         
fevals 597.0000   
Method Nelder-Mead

Used Options:                 
Correlation FALSE
Warning message:
For some parameters the standard error could not be calculated. 

Then this code:
predict(est)

Result:

  Id period.first period.last period.length actual.x actual.spending    PAlive        CET DERT predicted.Spending
   1: ***   2017-10-13  2020-07-28      145.7143        1         359.200 0.2918841 0.34676760  Inf           591.6977
   2: ***   2017-10-13  2020-07-28      145.7143        0           0.000 0.2241285 0.05602038  Inf          1002.0443
   3: ***   2017-10-13  2020-07-28      145.7143        0           0.000 0.1439510 0.17330498  Inf          1459.1719
   4: ***   2017-10-13  2020-07-28      145.7143        0           0.000 0.5490419 1.98142399  Inf          1067.7495
   5: ***   2017-10-13  2020-07-28      145.7143        4        3103.074 0.8837950 1.86889894  Inf           725.0541

2694: ***   2017-10-13  2020-07-28      145.7143        0           0.000 0.2418898 0.06295234  Inf          1002.0443
2695: ***   2017-10-13  2020-07-28      145.7143        0           0.000 0.1181469 0.08844852  Inf           897.3543
2696: ***   2017-10-13  2020-07-28      145.7143        0           0.000 0.1980979 0.14830231  Inf           880.8387
2697: ***   2017-10-13  2020-07-28      145.7143        1         554.280 0.3549185 0.61237296  Inf           673.1570
2698: ***   2017-10-13  2020-07-28      145.7143        0           0.000 0.2418898 0.06295234  Inf          1002.0443
      predicted.CLV
   1:           Inf
   2:           Inf
   3:           Inf
   4:           Inf
   5:           Inf

2694:           Inf
2695:           Inf
2696:           Inf
2697:           Inf
2698:           Inf

What is puzzling is that it was possible to estimate PAlive, the average spending, and CET, but not CLV, which should simply be the product of spending and CET, right?

Thanks
Christian

@cvermehren
Author

For your information, here is a plot of est, from plot(est):

[plot(est) output omitted: actual vs. predicted repeat transactions over the fitting period]

@mmeierer
Collaborator

mmeierer commented Jul 29, 2020

Thanks for sharing. The fit is not too bad for the standard Pareto/NBD model.

Looks like you could really benefit from controlling for seasonality ;-)

Be aware that the estimation with time-varying covariates will take considerably more time than the estimation without covariates. Executing the code in the walkthrough (https://www.clvtools.com/articles/CLVTools.html) gives you a good overview of this.
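For reference, a rough sketch of how seasonal dummies could enter the model as dynamic covariates. The data objects (cov.life, cov.trans) and the high.season column are illustrative assumptions; the walkthrough linked above shows the authoritative example:

```r
library(CLVTools)

# Assumed: cov.life and cov.trans are data.tables with one row per
# customer and week, with columns Id, Cov.Date and a dummy high.season.
clv.cov <- SetDynamicCovariates(
  clv.data        = cht.clv,
  data.cov.life   = cov.life,
  data.cov.trans  = cov.trans,
  names.cov.life  = "high.season",
  names.cov.trans = "high.season",
  name.id         = "Id",
  name.date       = "Cov.Date"
)

# Fitting with dynamic covariates is considerably slower than the plain model:
est.cov <- pnbd(clv.cov, optimx.args = list(method = "Nelder-Mead"))
```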

Are you able to share which industry we are talking about here?

@cvermehren
Author

Hi Patrik,

If you send me an email to [email protected], then I will share some info about the dataset.

Christian

@pschil
Collaborator

pschil commented Jul 29, 2020

but not CLV which should be simply a multiplication of spending and CET, right?

Not quite. CLV is the multiplication of DERT (Discounted Expected Residual Transactions) and mean spending. More formally, you assume spending to be constant and can pull it out of the integral; what remains is the DERT (= transactions * survival * discount rate, integrated from T to infinity).
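In terms of the prediction table above, this relationship can be checked directly. A sketch, assuming est is the fitted model from earlier and that predict() returns a data.table with the columns printed above:

```r
res <- predict(est)

# predicted.CLV is DERT times predicted mean spending -- not CET times
# spending -- so an Inf in DERT propagates into predicted.CLV even when
# CET and predicted.Spending are finite:
head(res[, .(predicted.CLV, check = DERT * predicted.Spending)])
```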

See also ?predict.clv.fitted for more information about the prediction output (it will be accessible through ?predict in the next version).

From the summary() output, the beta coefficient seems very high at 7721, which leads to DERT = Inf, likely due to the hypergeometric 1F1 function involved in pnbd's DERT expression.

In case your customers come from different industries or business segments, you might want to split them up and fit the model separately in order to obtain more helpful parameter estimates. Also, removing customers that bought only once during the calibration period ("zero-repeaters") may yield better estimates.
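A minimal sketch of the zero-repeater filter, assuming cht is the raw transaction data.table from the first post (columns cust, date, sales) and that the calibration period ends 2017-10-12 as in the summary output:

```r
library(data.table)
library(CLVTools)

calib.end <- as.Date("2017-10-12")

# Keep only customers with more than one transaction in the calibration period
repeat.custs <- cht[date <= calib.end, .N, by = cust][N > 1, cust]
cht.rep <- cht[cust %in% repeat.custs]

cht.clv.rep <- clvdata(cht.rep, date.format = "ymd", time.unit = "w",
                       estimation.split = 145,
                       name.id = "cust", name.date = "date",
                       name.price = "sales")
```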

@cvermehren
Author

I also suspect that there are in fact different customer segments involved, but it isn't clear from the data.

Thanks for the general tips!

@mmeierer mmeierer closed this as completed Aug 2, 2020
@mmeierer mmeierer added the use case General application question related to an user's use case. label Oct 5, 2020
@mmeierer mmeierer added the FAQ Issue that should be listed in the FAQ label Apr 13, 2021