ITAM-CONAC

STATISTICAL METHODS IN ACTUARIAL SCIENCE I

DR. JUAN JOSÉ FERNÁNDEZ DURÁN

EXAMPLE OF POISSON AND GAMMA REGRESSION

AUTOMOBILE INSURANCE CLAIMS IN SWEDEN

1) DESCRIPTION OF THE DATA

The data contain the number of claims, and the amounts paid for those claims, for automobile insurance in Sweden in 1977. There are 2182 records, each corresponding to a group of policyholders that share the same rating factors. The variables are the following:

1. Kilometres: average kilometres driven per year: 1 = less than 1000, 2 = 1000 to 15000, 3 = 15000 to 20000, 4 = 20000 to 25000, 5 = more than 25000 (qualitative, ordinal)

2. Zone: geographic zone: 1 = Stockholm, Gothenburg, Malmö and their surroundings, 2 = other large cities and their surroundings, 3 = small cities in southern Sweden, 4 = rural areas in southern Sweden, 5 = small cities in northern Sweden, 6 = rural areas in northern Sweden, 7 = Gotland (province) (qualitative, nominal)

3. Bonus: number of years, plus one, since the last claim (quantitative, discrete)

4. Make: 9 categories of car model (qualitative, nominal)

5. Insured: exposure in policy-years (quantitative, continuous)

6. Claims: number of claims (quantitative, discrete)

7. Payment: total value of the payments made for the claims, in Swedish kronor (quantitative, continuous)

2) DATABASE (first six rows)

  Kilometres Zone Bonus Make Insured Claims Payment
1          1    1     1    1  455.13    108  392491
2          1    1     1    2   69.17     19   46221
3          1    1     1    3   72.88     13   15694
4          1    1     1    4 1292.39    124  422201
5          1    1     1    5  191.01     40  119373
6          1    1     1    6  477.66     57  170913
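As a rough illustration of how these variables combine, the sketch below computes the observed claim frequency (Claims / Insured) and the observed average severity (Payment / Claims) for the six rows shown above. The names `rows`, `observed_frequency` and `observed_severity` are illustrative and not part of the original analysis.

```python
# Rows shown above: (Kilometres, Zone, Bonus, Make, Insured, Claims, Payment)
rows = [
    (1, 1, 1, 1, 455.13, 108, 392491),
    (1, 1, 1, 2, 69.17, 19, 46221),
    (1, 1, 1, 3, 72.88, 13, 15694),
    (1, 1, 1, 4, 1292.39, 124, 422201),
    (1, 1, 1, 5, 191.01, 40, 119373),
    (1, 1, 1, 6, 477.66, 57, 170913),
]


def observed_frequency(insured, claims):
    """Claims per policy-year of exposure."""
    return claims / insured


def observed_severity(claims, payment):
    """Average payment per claim, in Swedish kronor."""
    return payment / claims


# One (Make, frequency, severity) triple per row.
summary = [
    (make, observed_frequency(insured, claims), observed_severity(claims, payment))
    for (_, _, _, make, insured, claims, payment) in rows
]

for make, freq, sev in summary:
    print(f"Make {make}: frequency = {freq:.4f}, severity = {sev:.0f}")
```

Frequency is the quantity targeted by the Poisson and negative binomial models below, and severity is the quantity targeted by the Gamma model.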

3) EXPLORATORY DATA ANALYSIS

Kilometres (number of records per category)
  1   2   3   4   5
439 441 441 434 427

Zone (number of records per category)
  1   2   3   4   5   6   7
315 315 315 315 313 315 294

Bonus
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   2.000   4.000   4.015   6.000   7.000

Make (number of records per category)
  1   2   3   4   5   6   7   8   9
245 245 242 238 244 244 242 237 245

Insured
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
     0.01     21.61     81.53   1092.00    389.80 127700.00

Claims
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    1.00    5.00   51.87   21.00 3338.00

Payment
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
       0     2989    27400   257000   112000 18250000

4) POISSON REGRESSION MODEL: FREQUENCY

OFFSET: log(Insured)
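The role of the offset can be written out as follows. This is a sketch in generic notation: the symbols $x_i$ (covariate vector of record $i$) and $\beta$ (coefficient vector) are introduced here for illustration, and the log link is the default for the Poisson family.

```latex
\log E[\mathrm{Claims}_i] = \log(\mathrm{Insured}_i) + x_i^{\top}\beta
\quad\Longleftrightarrow\quad
E[\mathrm{Claims}_i] = \mathrm{Insured}_i \, \exp(x_i^{\top}\beta)
```

With this offset, $\exp(x_i^{\top}\beta)$ is the expected number of claims per policy-year, so the coefficients describe claim frequency rather than raw claim counts.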

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) -1.214998 0.026158 -46.448 < 2e-16 ***

Zone2 -0.346817 0.020823 -16.655 < 2e-16 ***

Zone3 -0.500711 0.021429 -23.366 < 2e-16 ***

Zone4 -0.728748 0.019300 -37.759 < 2e-16 ***

Zone5 -0.516962 0.033035 -15.649 < 2e-16 ***

Zone6 -0.707809 0.026944 -26.270 < 2e-16 ***

Zone7 -0.925981 0.093840 -9.868 < 2e-16 ***

Make2 -0.142938 0.060812 -2.350 0.018749 *

Make3 -0.295632 0.076515 -3.864 0.000112 ***

Make4 -1.092987 0.045508 -24.017 < 2e-16 ***

Make5 0.148346 0.046943 3.160 0.001577 **

Make6 -0.647975 0.039617 -16.356 < 2e-16 ***

Make7 -0.353962 0.063686 -5.558 2.73e-08 ***

Make8 -0.058195 0.097342 -0.598 0.549946

Make9 -0.375614 0.024274 -15.474 < 2e-16 ***

Bonus -0.259094 0.004783 -54.174 < 2e-16 ***

Zone2:Bonus 0.023908 0.003973 6.017 1.77e-09 ***

Zone3:Bonus 0.024206 0.004057 5.967 2.41e-09 ***

Zone4:Bonus 0.031791 0.003633 8.751 < 2e-16 ***

Zone5:Bonus 0.038044 0.006168 6.168 6.90e-10 ***

Zone6:Bonus 0.038753 0.005006 7.741 9.85e-15 ***

Zone7:Bonus 0.038342 0.017200 2.229 0.025806 *

Make2:Bonus 0.049883 0.010368 4.811 1.50e-06 ***

Make3:Bonus 0.023286 0.012659 1.840 0.065840 .

Make4:Bonus 0.064240 0.010080 6.373 1.85e-10 ***

Make5:Bonus -0.003098 0.008620 -0.359 0.719332

Make6:Bonus 0.053004 0.007371 7.190 6.46e-13 ***

Make7:Bonus 0.053923 0.010943 4.927 8.33e-07 ***

Make8:Bonus 0.022022 0.016206 1.359 0.174194

Make9:Bonus 0.053380 0.004361 12.241 < 2e-16 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 34070.6 on 2181 degrees of freedom

Residual deviance: 6528.3 on 2152 degrees of freedom

AIC: 14226

Number of Fisher Scoring iterations: 4

FINAL MODEL:

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) -1.217342 0.023799 -51.151 < 2e-16 ***

dummy$Zone2 -0.346791 0.020823 -16.654 < 2e-16 ***

dummy$Zone3 -0.500734 0.021430 -23.366 < 2e-16 ***

dummy$Zone4 -0.728840 0.019300 -37.764 < 2e-16 ***

dummy$Zone5 -0.517111 0.033035 -15.653 < 2e-16 ***

dummy$Zone6 -0.708293 0.026943 -26.289 < 2e-16 ***

dummy$Zone7 -0.926013 0.093842 -9.868 < 2e-16 ***

dummy$Make2 -0.140523 0.059831 -2.349 0.018840 *

dummy$Make3 -0.293226 0.075737 -3.872 0.000108 ***

dummy$Make4 -1.090577 0.044189 -24.680 < 2e-16 ***

dummy$Make5 0.133406 0.020250 6.588 4.46e-11 ***

dummy$Make6 -0.645539 0.038094 -16.946 < 2e-16 ***

dummy$Make7 -0.351535 0.062749 -5.602 2.12e-08 ***

dummy$Make8 0.066609 0.031575 2.110 0.034897 *

dummy$Make9 -0.373195 0.021700 -17.198 < 2e-16 ***

Bonus -0.258649 0.004290 -60.288 < 2e-16 ***

I(dummy$Zone2 * Bonus) 0.023910 0.003973 6.018 1.77e-09 ***

I(dummy$Zone3 * Bonus) 0.024224 0.004057 5.972 2.35e-09 ***

I(dummy$Zone4 * Bonus) 0.031829 0.003633 8.762 < 2e-16 ***

I(dummy$Zone5 * Bonus) 0.038086 0.006168 6.175 6.61e-10 ***

I(dummy$Zone6 * Bonus) 0.038868 0.005006 7.765 8.17e-15 ***

I(dummy$Zone7 * Bonus) 0.038362 0.017201 2.230 0.025730 *

I(dummy$Make2 * Bonus) 0.049411 0.010148 4.869 1.12e-06 ***

I(dummy$Make3 * Bonus) 0.022816 0.012479 1.828 0.067493 .

I(dummy$Make4 * Bonus) 0.063770 0.009853 6.472 9.67e-11 ***

I(dummy$Make6 * Bonus) 0.052527 0.007058 7.443 9.87e-14 ***

I(dummy$Make7 * Bonus) 0.053449 0.010735 4.979 6.39e-07 ***

I(dummy$Make9 * Bonus) 0.052907 0.003807 13.898 < 2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 34070.6 on 2181 degrees of freedom

Residual deviance: 6530.4 on 2154 degrees of freedom

AIC: 14224

Number of Fisher Scoring iterations: 4
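To see how the fitted coefficients and the offset translate into a prediction, the sketch below evaluates the final Poisson model for the first record of the database (Zone 1, Make 1, Bonus 1, Insured 455.13). It assumes Zone 1 and Make 1 are the reference levels, which is consistent with their absence from the coefficient table; the function name is illustrative.

```python
import math

# Coefficients copied from the final Poisson model above.
INTERCEPT = -1.217342
BETA_BONUS = -0.258649


def expected_claims_base_cell(insured, bonus):
    """Expected claims for Zone 1, Make 1 (all dummy variables equal to zero).

    The offset log(Insured) enters the linear predictor with coefficient 1,
    so the exposure simply multiplies exp(linear predictor).
    """
    eta = INTERCEPT + BETA_BONUS * bonus
    return insured * math.exp(eta)


expected = expected_claims_base_cell(insured=455.13, bonus=1)
print(f"expected claims: {expected:.1f} (observed: 108)")
```

For this record the prediction comes out at about 104 claims, against 108 observed.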

PROBLEM: VERY POOR FIT (RESIDUAL DEVIANCE 6530.4 ON 2154 DEGREES OF FREEDOM).

POSSIBLE CAUSE: OVERDISPERSION.
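A common informal check, not shown in the original notes, compares the residual deviance with its degrees of freedom: for a Poisson model that fits well the ratio should be close to 1. The sketch below applies this rule of thumb to the two Poisson fits reported above; the dictionary and function names are illustrative.

```python
# (residual deviance, residual degrees of freedom) from the Poisson fits above.
poisson_fits = {
    "initial": (6528.3, 2152),
    "final": (6530.4, 2154),
}


def dispersion_ratio(deviance, df):
    """Residual deviance per degree of freedom (rule-of-thumb dispersion)."""
    return deviance / df


ratios = {name: dispersion_ratio(dev, df) for name, (dev, df) in poisson_fits.items()}

for name, ratio in ratios.items():
    print(f"{name} model: deviance / df = {ratio:.2f}")
```

Both ratios are about 3, well above 1, which is consistent with the overdispersion diagnosis.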

4B) NEGATIVE BINOMIAL REGRESSION MODEL: FREQUENCY

(OVERDISPERSION)

OFFSET: log(Insured)

INITIAL MODEL:

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) -1.293984 0.056722 -22.813 < 2e-16 ***

dummy$Zone2 -0.336044 0.059090 -5.687 1.29e-08 ***

dummy$Zone3 -0.461724 0.059494 -7.761 8.43e-15 ***

dummy$Zone4 -0.669928 0.056897 -11.774 < 2e-16 ***

dummy$Zone5 -0.565145 0.072162 -7.832 4.82e-15 ***

dummy$Zone6 -0.682021 0.064786 -10.527 < 2e-16 ***

dummy$Zone7 -0.874743 0.125036 -6.996 2.64e-12 ***

dummy$Make2 -0.132673 0.080773 -1.643 0.100478

dummy$Make3 -0.313009 0.095060 -3.293 0.000992 ***

dummy$Make4 -1.028398 0.076786 -13.393 < 2e-16 ***

dummy$Make5 0.172831 0.071460 2.419 0.015581 *

dummy$Make6 -0.612087 0.066626 -9.187 < 2e-16 ***

dummy$Make7 -0.288240 0.083581 -3.449 0.000563 ***

dummy$Make8 -0.031708 0.111370 -0.285 0.775867

dummy$Make9 -0.287987 0.054168 -5.317 1.06e-07 ***

Bonus -0.243607 0.011837 -20.580 < 2e-16 ***

I(dummy$Zone2 * Bonus) 0.026005 0.012227 2.127 0.033426 *

I(dummy$Zone3 * Bonus) 0.019706 0.012277 1.605 0.108482

I(dummy$Zone4 * Bonus) 0.028186 0.011776 2.393 0.016689 *

I(dummy$Zone5 * Bonus) 0.042402 0.014534 2.917 0.003529 **

I(dummy$Zone6 * Bonus) 0.033581 0.013177 2.548 0.010819 *

I(dummy$Zone7 * Bonus) 0.018074 0.024600 0.735 0.462519

I(dummy$Make2 * Bonus) 0.049001 0.015675 3.126 0.001771 **

I(dummy$Make3 * Bonus) 0.030717 0.017603 1.745 0.080982 .

I(dummy$Make4 * Bonus) 0.058438 0.017061 3.425 0.000614 ***

I(dummy$Make5 * Bonus) -0.005051 0.014590 -0.346 0.729177

I(dummy$Make6 * Bonus) 0.049961 0.013871 3.602 0.000316 ***

I(dummy$Make7 * Bonus) 0.042176 0.016297 2.588 0.009654 **

I(dummy$Make8 * Bonus) 0.015368 0.020242 0.759 0.447718

I(dummy$Make9 * Bonus) 0.054562 0.011470 4.757 1.96e-06 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(25.4515) family taken to be 1)

Null deviance: 6173.6 on 2181 degrees of freedom

Residual deviance: 2145.9 on 2152 degrees of freedom

AIC: 10970

Number of Fisher Scoring iterations: 1

Theta: 25.45

Std. Err.: 1.88

2 x log-likelihood: -10907.93

FINAL MODEL:

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) -1.410179 0.035799 -39.391 < 2e-16 ***

dummy$Zone2 -0.224255 0.026550 -8.447 < 2e-16 ***

dummy$Zone3 -0.376511 0.026694 -14.105 < 2e-16 ***

dummy$Zone4 -0.547450 0.025485 -21.481 < 2e-16 ***

dummy$Zone5 -0.378085 0.031798 -11.890 < 2e-16 ***

dummy$Zone6 -0.534189 0.028880 -18.497 < 2e-16 ***

dummy$Zone7 -0.799159 0.056316 -14.191 < 2e-16 ***

dummy$Make2 -0.105520 0.075557 -1.397 0.162543

dummy$Make3 -0.167209 0.036335 -4.602 4.19e-06 ***

dummy$Make4 -1.005102 0.071443 -14.069 < 2e-16 ***

dummy$Make5 0.140875 0.031164 4.520 6.17e-06 ***

dummy$Make6 -0.599903 0.060381 -9.935 < 2e-16 ***

dummy$Make7 -0.271347 0.078673 -3.449 0.000563 ***

dummy$Make9 -0.273990 0.045937 -5.964 2.45e-09 ***

Bonus -0.214524 0.005999 -35.760 < 2e-16 ***

I(dummy$Make2 * Bonus) 0.041209 0.014235 2.895 0.003792 **

I(dummy$Make4 * Bonus) 0.050973 0.015780 3.230 0.001237 **

I(dummy$Make6 * Bonus) 0.044847 0.012245 3.663 0.000250 ***

I(dummy$Make7 * Bonus) 0.036240 0.014936 2.426 0.015251 *

I(dummy$Make9 * Bonus) 0.049116 0.009361 5.247 1.55e-07 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(24.9859) family taken to be 1)

Null deviance: 6119.9 on 2181 degrees of freedom

Residual deviance: 2151.0 on 2162 degrees of freedom

AIC: 10967

Number of Fisher Scoring iterations: 1

Theta: 24.99

Std. Err.: 1.84

2 x log-likelihood: -10924.67
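The Theta value reported above controls how much extra variance the negative binomial model allows. The sketch below assumes the usual parametrization in which Var(Y) = mu + mu^2 / theta (the one used by `glm.nb` in the R package MASS, which the output format suggests but the notes do not state). The values of `mu` are hypothetical expected claim counts chosen only for illustration.

```python
THETA = 24.99  # "Theta" reported for the final negative binomial model


def nb_variance(mu, theta=THETA):
    """Variance of a negative binomial count with mean mu (assumed parametrization)."""
    return mu + mu ** 2 / theta


# Compare with the Poisson variance, which equals the mean.
for mu in (1.0, 10.0, 100.0):  # hypothetical expected claim counts
    print(f"mu = {mu:6.1f}: Poisson variance = {mu:7.1f}, NB variance = {nb_variance(mu):7.1f}")
```

Under this assumption the extra variance is negligible for cells with few expected claims and grows quickly for large cells, which are the ones where the Poisson model fits worst.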

EXAMPLES OF RATING FACTORS:

exp(-0.214524) = exp(beta_Bonus) = 0.8069254

exp(2*(-0.214524)) = exp(2*beta_Bonus) = 0.6511287

exp(3*(-0.214524)) = exp(3*beta_Bonus) = 0.5254123

exp(0.140875) = exp(beta_Make5) = 1.151281

exp(-1.005102) = exp(beta_Make4) = 0.3660073

For example, for a policyholder with Make=7, Bonus=3, Zone=7, the factor that multiplies exp(Intercept) is:

exp(3*beta_Make7:Bonus + 3*beta_Bonus + beta_Make7 + beta_Zone7) =

exp(0.036240*3 - 0.214524*3 - 0.271347 - 0.799159) = exp(-1.605358) = 0.2008176
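The calculation above can be generalized to any combination of Zone, Make and Bonus. The sketch below copies the coefficients of the final negative binomial model and treats every level without a coefficient (Zone 1, Make 1, and the terms dropped from the final model) as contributing zero; the function and dictionary names are illustrative.

```python
import math

# Coefficients of the final negative binomial model (intercept left out,
# so the result is a factor relative to exp(Intercept)).
NB_COEF = {
    "Zone2": -0.224255, "Zone3": -0.376511, "Zone4": -0.547450,
    "Zone5": -0.378085, "Zone6": -0.534189, "Zone7": -0.799159,
    "Make2": -0.105520, "Make3": -0.167209, "Make4": -1.005102,
    "Make5": 0.140875, "Make6": -0.599903, "Make7": -0.271347,
    "Make9": -0.273990,
    "Bonus": -0.214524,
    "Make2:Bonus": 0.041209, "Make4:Bonus": 0.050973,
    "Make6:Bonus": 0.044847, "Make7:Bonus": 0.036240,
    "Make9:Bonus": 0.049116,
}


def rating_factor(zone, make, bonus, coef=NB_COEF):
    """Multiplicative frequency factor for a given Zone, Make and Bonus."""
    eta = coef["Bonus"] * bonus
    eta += coef.get(f"Zone{zone}", 0.0)  # reference level contributes 0
    eta += coef.get(f"Make{make}", 0.0)  # reference or dropped level contributes 0
    eta += coef.get(f"Make{make}:Bonus", 0.0) * bonus
    return math.exp(eta)


factor = rating_factor(zone=7, make=7, bonus=3)
print(f"Make=7, Bonus=3, Zone=7: {factor:.7f}")
```

Multiplying the factor by exp(Intercept) and by the exposure in policy-years gives the expected number of claims.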

5) GAMMA REGRESSION MODEL: SEVERITY

OFFSET: log(Claims)

INITIAL MODEL:

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 8.3258777 0.1586479 52.480 <2e-16 ***

dummy$Zone2 -0.0822650 0.1525932 -0.539 0.5899

dummy$Zone3 0.1186210 0.1515616 0.783 0.4339

dummy$Zone4 0.1513074 0.1509229 1.003 0.3162

dummy$Zone5 0.1158077 0.1640002 0.706 0.4802

dummy$Zone6 0.1831158 0.1575154 1.163 0.2452

dummy$Zone7 -0.0919868 0.2148388 -0.428 0.6686

dummy$Make2 0.0348368 0.1812066 0.192 0.8476

dummy$Make3 0.3288051 0.1896622 1.734 0.0832 .

dummy$Make4 0.1817264 0.1864432 0.975 0.3298

dummy$Make5 0.1117441 0.1770404 0.631 0.5280

dummy$Make6 0.1101039 0.1774434 0.621 0.5350

dummy$Make7 -0.0781639 0.1834040 -0.426 0.6700

dummy$Make8 0.4232748 0.1909293 2.217 0.0268 *

dummy$Make9 0.0276298 0.1711544 0.161 0.8718

Bonus 0.0379201 0.0350455 1.082 0.2794

I(dummy$Zone2 * Bonus) 0.0026408 0.0337621 0.078 0.9377

I(dummy$Zone3 * Bonus) -0.0222333 0.0335572 -0.663 0.5077

I(dummy$Zone4 * Bonus) -0.0160044 0.0335013 -0.478 0.6329

I(dummy$Zone5 * Bonus) 0.0005123 0.0354181 0.014 0.9885

I(dummy$Zone6 * Bonus) -0.0065242 0.0344025 -0.190 0.8496

I(dummy$Zone7 * Bonus) 0.0292067 0.0444205 0.658 0.5109

I(dummy$Make2 * Bonus) -0.0192713 0.0394542 -0.488 0.6253

I(dummy$Make3 * Bonus) -0.0408060 0.0407611 -1.001 0.3169

I(dummy$Make4 * Bonus) -0.0562166 0.0412710 -1.362 0.1733

I(dummy$Make5 * Bonus) -0.0301734 0.0391372 -0.771 0.4408

I(dummy$Make6 * Bonus) -0.0216627 0.0391762 -0.553 0.5804

I(dummy$Make7 * Bonus) 0.0010806 0.0400084 0.027 0.9785

I(dummy$Make8 * Bonus) -0.0402547 0.0411310 -0.979 0.3279

I(dummy$Make9 * Bonus) -0.0175537 0.0379319 -0.463 0.6436

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.6819326)

Null deviance: 899.37 on 1796 degrees of freedom

Residual deviance: 866.55 on 1767 degrees of freedom

AIC: 42373

Number of Fisher Scoring iterations: 7

FINAL MODEL:

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 8.51167 0.02537 335.537 < 2e-16 ***

dummy$Zone2 -0.13064 0.05362 -2.436 0.01493 *

dummy$Zone6 0.09550 0.05613 1.701 0.08904 .

dummy$Make3 0.18883 0.06576 2.871 0.00414 **

dummy$Make8 0.28775 0.06658 4.322 1.63e-05 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.684663)

Null deviance: 899.37 on 1796 degrees of freedom

Residual deviance: 874.51 on 1792 degrees of freedom

AIC: 42341

Number of Fisher Scoring iterations: 6
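The final Gamma model can be read as a small severity tariff. The sketch below assumes a log link, which the notes do not state but which the offset log(Claims) and the size of the intercept strongly suggest. Under that assumption E[Payment] = Claims * exp(linear predictor), so exp(linear predictor) is the expected payment per claim. The names used are illustrative.

```python
import math

# Coefficients of the final Gamma model above.
GAMMA_COEF = {
    "(Intercept)": 8.51167,
    "Zone2": -0.13064,
    "Zone6": 0.09550,
    "Make3": 0.18883,
    "Make8": 0.28775,
}


def expected_severity(zone, make, coef=GAMMA_COEF):
    """Expected payment per claim (Swedish kronor), assuming a log link."""
    eta = coef["(Intercept)"]
    eta += coef.get(f"Zone{zone}", 0.0)  # levels not in the model contribute 0
    eta += coef.get(f"Make{make}", 0.0)
    return math.exp(eta)


base = expected_severity(zone=1, make=1)
print(f"base severity:           {base:.0f}")
print(f"Zone 2, Make 8 severity: {expected_severity(zone=2, make=8):.0f}")
```

Only Zone 2, Zone 6, Make 3 and Make 8 move the severity away from the base value, which reflects how few terms survived in the final model.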

FINAL MODEL: LOGNORMAL, SINCE THE GAMMA MODEL SHOWS PROBLEMS IN THE RESIDUAL ANALYSIS

OFFSET: ln(Claims) - ln(Insured)

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 14.3074 0.1141 125.360 < 2e-16 ***

dummy$Zone3 0.2324 0.1086 2.139 0.032535 *

dummy$Zone4 0.9436 0.1069 8.825 < 2e-16 ***

dummy$Zone5 -0.7525 0.1181 -6.374 2.34e-10 ***

dummy$Zone7 -2.6751 0.1661 -16.106 < 2e-16 ***

dummy$Make2 -1.5387 0.1547 -9.946 < 2e-16 ***

dummy$Make3 -1.6495 0.1611 -10.241 < 2e-16 ***

dummy$Make4 -1.1665 0.1673 -6.974 4.31e-12 ***

dummy$Make5 -1.6179 0.1546 -10.466 < 2e-16 ***

dummy$Make6 -0.5635 0.1533 -3.676 0.000244 ***

dummy$Make7 -1.7932 0.1565 -11.460 < 2e-16 ***

dummy$Make8 -2.1962 0.1625 -13.515 < 2e-16 ***

dummy$Make9 2.0419 0.1482 13.781 < 2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.604 on 1784 degrees of freedom

Multiple R-squared: 0.4786, Adjusted R-squared: 0.4751

F-statistic: 136.4 on 12 and 1784 DF, p-value: < 2.2e-16

RATING FACTORS OF THE LOGNORMAL MODEL:

dummy$Zone3 1.261646

dummy$Zone4 2.569209

dummy$Zone5 0.471191

dummy$Zone7 0.068902

dummy$Make2 0.214661

dummy$Make3 0.192155

dummy$Make4 0.311443

dummy$Make5 0.198311

dummy$Make6 0.569236

dummy$Make7 0.166418

dummy$Make8 0.111222

dummy$Make9 7.705244
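The factors listed above are the exponentials of the lognormal model's coefficients. The sketch below reproduces them from the printed estimates; because those estimates are rounded to four decimals, the results can differ from the listed factors in the last digits. The dictionary names are illustrative.

```python
import math

# Estimates copied from the lognormal model output above (intercept omitted).
LOGNORMAL_COEF = {
    "Zone3": 0.2324, "Zone4": 0.9436, "Zone5": -0.7525, "Zone7": -2.6751,
    "Make2": -1.5387, "Make3": -1.6495, "Make4": -1.1665, "Make5": -1.6179,
    "Make6": -0.5635, "Make7": -1.7932, "Make8": -2.1962, "Make9": 2.0419,
}

# Each rating factor is exp(coefficient): a multiplier relative to the base level.
factors = {name: math.exp(beta) for name, beta in LOGNORMAL_COEF.items()}

for name, value in factors.items():
    print(f"{name}: {value:.6f}")
```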