Intro

BLPestimatoR provides an efficient estimation algorithm to perform the demand estimation described in Berry, Levinsohn, and Pakes (1995). The routine uses analytic gradients and offers a large number of optimization routines and implemented integration methods as discussed in Brunner et al. (2017).

This extended documentation demonstrates the steps of a typical demand estimation with the package:

• prepare the data with BLP_data (includes the specification of a model and providing integration draws for observed or unobserved heterogeneity)
• estimate the parameters with estimate_BLP
• showing the results with summary
• view the own- and crosspriceelasticities with get_elasticities
• perform a hypothetical merger analysis

For this purpose the well-known training datasets for the cereal market (Nevo 2001) and the car market (Berry, Levinsohn, and Pakes 1995) are included in the package. Loading the package is therefore the very first step of the demand estimation:

library(BLPestimatoR)

Data

Model

Since version 0.1.6 the model is provided in R’s formula syntax and consists of five parts. The variable to be explained is given by observed market shares. Explanatory variables are grouped into four (possibly overlapping) categories separated by |:

• linear variables
• exogenous variables
• random coefficients
• instruments

The first part of this documentation starts with the cereal data example from Nevo (2001). Nevo’s model can be translated into the following formula syntax:

The model is directly related to consumer $$i$$’s indirect utility from purchasing cereal $$j$$ in market $$t$$:

$u_{ijt}=\sum_{m=1}^M x^{(m)}_{jt} \beta_{i,m}+\xi_{jt}+\epsilon_{ijt} \;\; \text{with}$ $\beta_{i,m}= \bar{\beta}_m + \sum_{r=1}^R \gamma_{m,r} d_{i,r} + \sigma_m \nu_{i,m}$ and

• M = 4 random coefficients (price, sugar, mushy and an intercept)
• R = 4 demographics (income, incomesq, age, child)
• and the set of non-linear parameters to estimate:

$\theta_2 = \begin{pmatrix} \sigma_1 & \gamma_{1,1} & \cdots & \gamma_{1,R} \\ \sigma_2 & \gamma_{2,1} & \cdots & \gamma_{2,R} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_M & \gamma_{M,1} & \cdots & \gamma_{M,R} \end{pmatrix}$

Dataframe

Product related variables are collected in the dataframe productData with the following requirements:

• missings are not allowed
• character variables are automatically transformed to a set of dummy variables
• a variable that describes market affiliation (market_identifier)

A variable that uniquely identifies a product in a market (product_identifier) is optional, but enhances clarity (interpreting elasticities, for example, is much easier). market_identifier and product_identifier together uniquely identify an observation, which is used by the function update_BLP_data to update any variable in the data (in this case product_identifier is mandatory).

In the cereal example, this gives the following dataframe:

#>        price const sugar mushy       share     cdid        IV1        IV2
#> 1 0.07208794     1     2     1 0.012417212 market_1 -0.2159728 0.04057341
#> 2 0.11417849     1    18     1 0.007809387 market_1 -0.2452393 0.05474226
#> 3 0.13239066     1     4     1 0.012994511 market_1 -0.1764587 0.04659597
#> 4 0.13034408     1     3     0 0.005769961 market_1 -0.1214013 0.04876037
#> 5 0.15482331     1    12     0 0.017934141 market_1 -0.1326114 0.03962835
#> 6 0.13704921     1    14     0 0.026601892 market_1 -0.1534998 0.04298842
#>          IV3          IV4         IV5           IV6        IV7         IV8
#> 1  -3.247948 -0.523937695 -0.23246005  0.0068326605  3.1397395 -0.57478633
#> 2 -19.832461 -0.180519694  0.01468859  0.0007988026  0.2876539  0.03293960
#> 3  -2.878531 -0.284219004 -0.21553691 -0.0318693281  2.8862741 -0.74976495
#> 4  -2.059918 -0.328412257 -0.22206995 -0.0314740402  4.4531096  0.25567529
#> 5  -6.137598 -0.138625095 -0.18936521 -0.0437471023 -3.5546508  0.13882114
#> 6  -8.417332  0.007829087 -0.13850121 -0.0210582272 -2.7594799  0.05020052
#>          IV9       IV10       IV11        IV12          IV13       IV14
#> 1  0.2062202  0.1774656  2.1163580 -0.15470824 -0.0057964065 0.01453802
#> 2  0.1051208 -0.2875618 -7.3740909 -0.57641176  0.0129908544 0.07614324
#> 3 -0.4789565  0.2147389  2.1878721 -0.20734643  0.0035092777 0.09178117
#> 4 -0.4729673  0.3560980  2.7045762  0.04074801 -0.0037242656 0.09473168
#> 5 -0.6886784  0.2602726  1.2612419  0.03483558 -0.0005676374 0.10245147
#> 6 -0.2734440  0.1273060  0.3375543  0.02351037  0.0002637777 0.08627983
#>         IV15       IV16       IV17       IV18       IV19       IV20
#> 1 0.12624398 0.06734464 0.06842261 0.03480046 0.12634612 0.03548368
#> 2 0.02973565 0.08786672 0.11050060 0.08778380 0.04987192 0.07257905
#> 3 0.16377308 0.11188073 0.10822551 0.08643905 0.12234707 0.10184248
#> 4 0.13527378 0.08809001 0.10176745 0.10177748 0.11074119 0.10433204
#> 5 0.13063951 0.08481820 0.10107461 0.12516923 0.13346381 0.12111110
#> 6 0.07233581 0.02225051 0.10564387 0.11603699 0.09965063 0.10572660
#>   product_id productdummy
#> 1   cereal_1     product1
#> 2   cereal_2     product2
#> 3   cereal_3     product3
#> 4   cereal_4     product4
#> 5   cereal_5     product5
#> 6   cereal_6     product6

Integration Draws

The arguments related to the numerical integration problem are of particular importance when providing own integration draws and weights, which is most relevant for observed heterogeneity (for unobserved heterogeneity, the straightforward approach is the use of automatic integration).

In the cereal data, both, observed and unobserved heterogeneity, is used for the random coefficients. Starting with observed heterogeneity, user provided draws are collected in a list. Each list entry must be named according to the name of a demographic. Each entry contains the following variables:

• a variable market_identifier that matches each line to a market (same variable name as in productData)
• integration draws for each market

In the cereal example, observed heterogeneity is provided as follows (list names correspond to the demographics):

If demographic input (demographicData) is missing, the estimation routine considers only coefficients for unobserved heterogeneity. This can be done by already implemented integration methods via integration_method as shown in the estimation section. In Nevo’s cereal example however, a specific set of 20 draws is given. For this situation, draws are also provided as a list (list names correspond to the formula’s random coefficients and each list entry has a variable market_identifier):

As demonstrated above, list entries for draws of constants must be named (Intercept). Other names of list entries must match the random coefficients specified in the formula.

Calling BLP_data

Calling BLP_data structures and prepares the data for estimation and creates the data object:

The arguments in greater detail:

• model provides the utility model as explained above

• market_identifier gives the name of the variable in productData that matches each observation to a market

• product_identifier gives the name of the variable in productData that matches each observation to a product (must be unique in a market)

• productData is given as a dataframe and demographicData as a list as described above

• par_delta gives the name of the variable in productData for mean utilities

• blp_inner_tol , blp_inner_maxit: arguments related to be BLP algorithm include the convergence threshold and the maximum number of iterations in the contraction mapping

• if integration draws are provided manually, integration_draws and integration_weights need to be specified

• for automatic integration the user specifies integration_method, for example integration_method= "MLHS", and the accuracy of the integration method by integration_accuracy (for stochastic integration methods this equals the number of draws)

If you decide to update your data later, you can use the function update_BLP_data.

Postestimation

Standard Errors

Standard errors can be computed with three options that control for the unobserved characteristic $$\xi$$, which consists of $$N$$ elements. $$\Omega$$ denotes the variance covariance matrix of $$\xi$$.

• option homoskedastic requires the standard deviation $$\sigma_i$$ for each $$\xi_i \;\forall i\in 1,\cdots,N$$ to be identical: $\Omega = \begin{pmatrix} \sigma & 0 & \dots & 0\\ 0 & \sigma & & 0\\ \vdots & & \ddots & 0\\ 0 & 0 & 0 & \sigma \\ \end{pmatrix}$

• option heteroskedastic allows for individual standard deviations $$\sigma_i$$ for each $$\xi_i$$ : $\Omega = \begin{pmatrix} \sigma_1 & 0 & \dots & 0\\ 0 & \sigma_2 & & 0\\ \vdots & & \ddots & 0\\ 0 & 0 & 0 & \sigma_N \\ \end{pmatrix}$

• option cluster allows for cluster individual variance covariance matrices in each of $$M$$ cluster groups. For this option the argument group_structure needs to be specified in the function BLP_data to determine the cluster group. This gives the block-diagonal form with $$\Sigma_m$$ as the variance covariance matrix for all $$\xi_i$$ in cluster $$m$$:

$\Omega = \begin{pmatrix} \Sigma_1 & 0 & \dots & 0\\ 0 & \Sigma_2 & & 0\\ \vdots & & \ddots & 0\\ 0 & 0 & 0 & \Sigma_M \\ \end{pmatrix}$

Elasticities

The following code demonstrates the calculation of elasticities for the estimation object cereal_est.

The value of the elasticity matrix in row $$j$$ and column $$i$$ for a variable $$x$$, gives the effect of a change in product $$i$$’s characteristic $$x$$ on the share of product $$j$$.

Modular Examples

Further analysis like incorporating a supply side or performing a merger simulation often requires access to building blocks of the BLP algorithm. The following wrappers insure correct data inputs and access the internal functions of the algorithm.

In the following, you find an example of the contraction mapping and an evaluation of the GMM function at the starting guess:

delta_eval <- getDelta_wrap(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
printLevel = 4
)
#> ----------------------
#>   dist: 0.230998
#>   dist: 0.188133
#>   dist: 0.161181
#>   dist: 0.133303
#>   dist: 0.107161
#>   dist: 0.0844808
#>   dist: 0.0657196
#>   dist: 0.0506556
#>   dist: 0.0387915
#>   dist: 0.0295683
#>   dist: 0.0224622
#>   dist: 0.0170219
#>   dist: 0.0128758
#>   dist: 0.00972657
#>   dist: 0.00734024
#>   dist: 0.00553524
#>   dist: 0.00417178
#>   dist: 0.00314287
#>   dist: 0.00236698
#>   dist: 0.00178222
#>   dist: 0.00134169
#>   dist: 0.00100991
#>   dist: 0.000760105
#>   dist: 0.000572046
#>   dist: 0.000430491
#>   dist: 0.00032395
#>   dist: 0.000243769
#>   dist: 0.00019173
#>   dist: 0.000156244
#>   dist: 0.000127321
#>   dist: 0.000103748
#>   dist: 8.45372e-05
#>   dist: 6.88821e-05
#>   dist: 5.61251e-05
#>   dist: 4.573e-05
#>   dist: 3.72598e-05
#>   dist: 3.03582e-05
#>   dist: 2.47347e-05
#>   dist: 2.01528e-05
#>   dist: 1.64195e-05
#>   dist: 1.33778e-05
#>   dist: 1.08995e-05
#>   dist: 8.88033e-06
#>   dist: 7.23518e-06
#>   dist: 5.8948e-06
#>   dist: 4.80272e-06
#>   dist: 3.91296e-06
#>   dist: 3.18804e-06
#>   dist: 2.59741e-06
#>   dist: 2.14713e-06
#>   dist: 1.78121e-06
#>   dist: 1.47765e-06
#>   dist: 1.22582e-06
#>   dist: 1.01691e-06
#>   dist: 9.93231e-07

productData_cereal$startingGuessesDelta[1:6] #>  -3.944367 -2.845205 -3.958199 -4.934153 -2.425356 -4.086816 delta_eval$delta[1:6]
#> cereal_1_market_1 cereal_2_market_1 cereal_3_market_1 cereal_4_market_1
#>         -7.069801         -4.357675         -6.056909         -5.887501
#> cereal_5_market_1 cereal_6_market_1
#>         -3.501289         -3.079323
delta_eval$counter #>  55 gmm <- gmm_obj_wrap( blp_data = cereal_data, par_theta2 = theta_guesses_cereal, printLevel = 2 ) #> gmm objective: 29.3544 #> theta (RC): 0.33 2.45 0.02 0.24 #> theta (demogr.): 5.48 15.89 -0.25 1.26 0 -1.2 0 0 0.2 0 0.05 -0.81 0 2.63 0 0 #> inner iterations: 55 #> gradient: 9.8468 0.3171 363.5319 16.3598 10.6031 0.7028 42.5236 -3.4759 13.4982 -2.0231 10.9331 1.2851 -0.5714 #>  29.35439 gmm$local_min
#>  29.35439

Printed distances in the contraction mapping are maximum absolute distances between the current vector of mean utilities and the previous one.

For any $$\theta_2$$, you can compute predicted shares:

shareObj <- getShareInfo(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
printLevel = 4
)
#> Mean utility (delta) is used as provided in the BLP_data() function.

shareObj$shares[1:6] #> cereal_1_market_1 cereal_2_market_1 cereal_3_market_1 cereal_4_market_1 #> 0.052856545 0.002175260 0.006105588 0.001529422 #> cereal_5_market_1 cereal_6_market_1 #> 0.001571742 0.003654192 The object contains a list of outputs that are useful for further economic analysis. For example, the list element sij contains share probabilities for every individual and needs to be given to calculate elasticities. The gradient contains two important building blocks as explained in the appendix of Nevo (2001): • $$\frac{\partial s_{ijt}}{\partial \theta_2}$$ , i.e. the derivative of individual $$i$$’s share of product $$j$$ in market $$t$$ with respect to non-linear parameters • $$\frac{\partial s_{ijt}}{\partial \delta}$$ , i.e. the derivative of individual $$i$$’s share of product $$j$$ in market $$t$$ with respect to mean utilities Both are used to compute the jacobian and are easy to obtain with the package as the following example demonstrates: # market 2: derivatives1 <- dstdtheta_wrap( blp_data = cereal_data, par_theta2 = theta_guesses_cereal, market = "market_2" ) #> Mean utility (delta) is used as provided in the BLP_data() function. derivatives2 <- dstddelta_wrap( blp_data = cereal_data, par_theta2 = theta_guesses_cereal, market = "market_2" ) #> Mean utility (delta) is used as provided in the BLP_data() function. jac_mkt2 <- -solve(derivatives2) %*% derivatives1 jac_mkt2[1:5, 1:4] #> unobs_sd*(Intercept) unobs_sd*price unobs_sd*sugar #> meanUtility_cereal_1 -0.3935523 0.010936676 -1.430109 #> meanUtility_cereal_2 -0.5385401 0.002959190 -8.608901 #> meanUtility_cereal_3 -0.4567829 0.003935221 -2.612346 #> meanUtility_cereal_4 -0.7766701 -0.022430917 -1.567758 #> meanUtility_cereal_5 -0.7821670 -0.085051524 -2.722660 #> unobs_sd*mushy #> meanUtility_cereal_1 -0.47865751 #> meanUtility_cereal_2 -0.20678073 #> meanUtility_cereal_3 -0.43499042 #> meanUtility_cereal_4 -0.04046717 #> meanUtility_cereal_5 -0.04388127 # all markets jacobian_nevo <- getJacobian_wrap( blp_data = cereal_data, par_theta2 = theta_guesses_cereal, printLevel = 2 ) #> Mean utility (delta) is used as provided in the BLP_data() function. jacobian_nevo[25:29, 1:4] # compare to jac_mkt2 #> unobs_sd*(Intercept) unobs_sd*price unobs_sd*sugar #> cereal_1_market_2 -0.3935523 0.010936676 -1.430109 #> cereal_2_market_2 -0.5385401 0.002959190 -8.608901 #> cereal_3_market_2 -0.4567829 0.003935221 -2.612346 #> cereal_4_market_2 -0.7766701 -0.022430917 -1.567758 #> cereal_5_market_2 -0.7821670 -0.085051524 -2.722660 #> unobs_sd*mushy #> cereal_1_market_2 -0.47865751 #> cereal_2_market_2 -0.20678073 #> cereal_3_market_2 -0.43499042 #> cereal_4_market_2 -0.04046717 #> cereal_5_market_2 -0.04388127 Another Example: Merger Analysis with BLP’s car data Analyzing a hypothetical merger is demonstrated by the car data of Berry, Levinsohn, and Pakes (1995). In this case, the preparation of product data comprises the computation of instruments as a function of product characteristics of competitors’ products (for details, check Berry, Levinsohn, and Pakes (1995)). This example is based on data and documentation of Knittel and Metaxoglou (2014). # add owner matix to productData own_pre <- dummies_cars colnames(own_pre) <- paste0("company", 1:26) productData_cars <- cbind(productData_cars, own_pre) # construct instruments nobs <- nrow(productData_cars) X <- data.frame( productData_cars$const, productData_cars$hpwt, productData_cars$air, productData_cars$mpg, productData_cars$space
)

sum_other <- matrix(NA, nobs, ncol(X))
sum_rival <- matrix(NA, nobs, ncol(X))
sum_total <- matrix(NA, nobs, ncol(X))

for (i in 1:nobs) {
other_ind <- productData_cars$firmid == productData_cars$firmid[i] &
productData_cars$cdid == productData_cars$cdid[i] &
productData_cars$id != productData_cars$id[i]
rival_ind <- productData_cars$firmid != productData_cars$firmid[i] &
productData_cars$cdid == productData_cars$cdid[i]
total_ind <- productData_cars$cdid == productData_cars$cdid[i]

sum_other[i, ] <- colSums(X[other_ind == 1, ])
sum_rival[i, ] <- colSums(X[rival_ind == 1, ])
sum_total[i, ] <- colSums(X[total_ind == 1, ])
}

colnames(sum_other) <- paste0("IV", 1:5)
colnames(sum_rival) <- paste0("IV", 6:10)
productData_cars <- cbind(productData_cars, sum_other, sum_rival)
#>   firmid cdid  id const    price      hpwt air   mpg  space        share
#> 1     15    1 129     1 4.935802 0.5289969   0 1.697 1.1502 0.0010512928
#> 2     15    1 130     1 5.516049 0.4943244   0 1.740 1.2780 0.0006700762
#> 3     15    1 132     1 7.108642 0.4676134   0 1.543 1.4592 0.0003405273
#> 4     15    1 134     1 6.839506 0.4265403   0 1.517 1.6068 0.0005222419
#> 5     15    1 136     1 8.928395 0.4524887   0 1.352 1.6458 0.0004424231
#> 6     19    1 138     1 7.153086 0.4508706   0 1.552 1.6224 0.0027560592
#>   company1 company2 company3 company4 company5 company6 company7 company8
#> 1        0        0        0        0        0        0        0        0
#> 2        0        0        0        0        0        0        0        0
#> 3        0        0        0        0        0        0        0        0
#> 4        0        0        0        0        0        0        0        0
#> 5        0        0        0        0        0        0        0        0
#> 6        0        0        0        0        0        0        0        0
#>   company9 company10 company11 company12 company13 company14 company15
#> 1        0         0         0         0         0         0         1
#> 2        0         0         0         0         0         0         1
#> 3        0         0         0         0         0         0         1
#> 4        0         0         0         0         0         0         1
#> 5        0         0         0         0         0         0         1
#> 6        0         0         0         0         0         0         0
#>   company16 company17 company18 company19 company20 company21 company22
#> 1         0         0         0         0         0         0         0
#> 2         0         0         0         0         0         0         0
#> 3         0         0         0         0         0         0         0
#> 4         0         0         0         0         0         0         0
#> 5         0         0         0         0         0         0         0
#> 6         0         0         0         1         0         0         0
#>   company23 company24 company25 company26 IV1       IV2 IV3    IV4     IV5
#> 1         0         0         0         0   4  1.840967   0  6.152  5.9898
#> 2         0         0         0         0   4  1.875639   0  6.109  5.8620
#> 3         0         0         0         0   4  1.902350   0  6.306  5.6808
#> 4         0         0         0         0   4  1.943423   0  6.332  5.5332
#> 5         0         0         0         0   4  1.917475   0  6.497  5.4942
#> 6         0         0         0         0  28 16.964809   0 44.328 46.1233
#>   IV6      IV7 IV8     IV9     IV10
#> 1  87 44.55554   0 150.386 125.5613
#> 2  87 44.55554   0 150.386 125.5613
#> 3  87 44.55554   0 150.386 125.5613
#> 4  87 44.55554   0 150.386 125.5613
#> 5  87 44.55554   0 150.386 125.5613
#> 6  63 29.50982   0 112.355  84.9556

# To show similarities between implementations of other authors,
# the variable "const" is used, although constants are considered by default.
blps_model <- as.formula("share ~  0 + const + price + hpwt + air + mpg + space |
0 + const + hpwt + air + mpg + space |
0 + price + const + hpwt + air + mpg |
0 + IV1 + IV2 + IV3 + IV4 + IV5 + IV6 + IV7 + IV8 + IV9 + IV10")

car_data <- BLP_data(
model = blps_model,
market_identifier = "cdid",
product_identifier = "id",
additional_variables = paste0("company", 1:26), # check reordering works
productData = productData_cars,
blp_inner_tol = 1e-9,
blp_inner_maxit = 5000,
integration_method = "MLHS",
integration_accuracy = 50, integration_seed = 48
)
#> Mean utility (variable name: delta) is initialized with 0 because of missing or invalid par_delta argument.

In the next step, starting guesses for random coefficients are generated from a standard normal distribution. The estimation of the model works like before.

set.seed(121)
theta_guesses <- matrix(rnorm(5))
rownames(theta_guesses) <- c("price", "const", "hpwt", "air", "mpg")
colnames(theta_guesses) <- "unobs_sd"

car_est <- estimateBLP(
blp_data = car_data,
par_theta2 = theta_guesses,
solver_method = "BFGS", solver_maxit = 1000, solver_reltol = 1e-6,
extremumCheck = FALSE, printLevel = 0
)
#> blp_data were prepared with the following arguments:
#> BLP_data(model = blps_model, market_identifier = "cdid", product_identifier = "id",
#>     additional_variables = paste0("company", 1:26), productData = productData_cars,
#>     integration_accuracy = 50, integration_method = "MLHS", integration_seed = 48,
#>     blp_inner_tol = 1e-09, blp_inner_maxit = 5000)
#> ------------------------------------------
#> Solver message: Successful convergence
#> ------------------------------------------
#> Final GMM evaluation at optimal parameters:
#> gmm objective: 131.2733
#>   theta (RC): -0.42 -8.59 1.27 -0.1 1.07
#>   theta (demogr.):
#>   inner iterations: 67
#>   gradient: -18.916 3.6499 27.2623 -8.5398 -28.9531
#> Using the heteroskedastic asymptotic variance-covariance matrix...

summary(car_est)
#>
#> Data information:
#>
#>   20 market(s) with 2217 products
#>   6 linear coefficient(s) (5 exogenous coefficients)
#>   5 non-linear parameters related to random coefficients
#>   0 demographic variable(s)
#>
#> Estimation results:
#>
#>  Linear Coefficients
#>          Estimate Std. Error    t value     Pr(>|t|)
#> const -13.0980731  1.0748672 -12.185759 3.701895e-34
#> price  -0.9049313  0.1609564  -5.622213 1.885269e-08
#> hpwt    2.1752674  0.8598893   2.529706 1.141581e-02
#> air     1.4011076  0.2567444   5.457207 4.836817e-08
#> mpg    -0.7994763  0.6366915  -1.255673 2.092345e-01
#> space   3.1465811  0.3542019   8.883580 6.474222e-19
#>
#>  Random Coefficients
#>                  Estimate Std. Error    t value     Pr(>|t|)
#> unobs_sd*price -0.4150985 0.08593854 -4.8301784 1.364108e-06
#> unobs_sd*const -8.5907908 2.19917403 -3.9063715 9.369243e-05
#> unobs_sd*hpwt   1.2712851 0.41076258  3.0949390 1.968534e-03
#> unobs_sd*air   -0.1005375 0.67001283 -0.1500531 8.807227e-01
#> unobs_sd*mpg    1.0652496 0.48637023  2.1902032 2.850950e-02
#>
#>  Wald Test
#> 45.1733 on  5 DF, p-value: 1.33780388852691e-08
#>
#> Computational Details:
#>   Solver converged with 187 iterations to a minimum at 131.2733 .
#>   Local minima check: NA
#>   stopping criterion outer loop: 1e-06
#>   stopping criterion inner loop: 1e-09
#>   Market shares are integrated with MLHS and 50 draws.
#>   Method for standard errors: heteroskedastic

Next, all parameters that are required by the subsequent merger analysis are extracted. Note that all extracted data is based on the estimation object car_est or the data object car_data to maintain data consistency (for example, the order of data in product_data_cars might differ from car_data). Moreover, mean utilities are updated in car_data by the values in the estimation object car_est.

## Pre-Merger data
own_pre <- as.matrix(car_data$data$additional_data[, paste0("company", 1:26)])
delta_pre <- car_est$delta theta1_price <- car_est$theta_lin["price", ]
theta2_price <- car_est$theta_rc["unobs_sd*price"] theta2_all <- matrix(car_est$theta_rc)
rownames(theta2_all) <- c("price", "const", "hpwt", "air", "mpg")
colnames(theta2_all) <- "unobs_sd"

## update mean utility in data ( always use update_BLP_data() to update data object to maintain consistent data )
delta_data <- data.frame(
"id" = car_data$parameters$product_id,
"cdid" = car_data$parameters$market_id,
"delta" = delta_pre
)
car_data_updated <- update_BLP_data(
data_update = delta_data,
blp_data = car_data
)
#> Mean utility variable delta has been updated.

In the next step, an estimate for marginal costs $$mc$$ before the merger is computed. The following is based on the FOC of a Bertrand equilibrium with prices $$p$$ before the merger:

$p^{pre} - \widehat{mc} = \Omega^{pre}(p^{pre})^{-1} \hat{s}(p^{pre})$ $$\Omega^{pre}(p^{pre})^{-1}$$ is defined marketwise as the inverse of

$\Omega^{pre}(p^{pre}) = \pmatrix{ -\frac{\partial s_{1}}{\partial p_{1}} (p^{pre}) \cdot D_{1,1} & -\frac{\partial s_{2}}{\partial p_{1}} (p^{pre}) \cdot D_{1,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{1}} (p^{pre}) \cdot D_{1,j} & \cdots & -\frac{\partial s_{J}}{\partial p_{1}} (p^{pre}) \cdot D_{1,J}\\ -\frac{\partial s_{1}}{\partial p_{2}} (p^{pre}) \cdot D_{2,1} & -\frac{\partial s_{2}}{\partial p_{2}} (p^{pre}) \cdot D_{2,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{2}} (p^{pre}) \cdot D_{2,j} & \cdots & -\frac{\partial s_{J}}{\partial p_{2}} (p^{pre}) \cdot D_{2,J}\\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots\\ -\frac{\partial s_{1}}{\partial p_{k}} (p^{pre}) \cdot D_{k,1} & -\frac{\partial s_{2}}{\partial p_{k}} (p^{pre}) \cdot D_{k,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{k}} (p^{pre}) \cdot D_{k,j} & \cdots & -\frac{\partial s_{J}}{\partial p_{k}} (p^{pre}) \cdot D_{k,J}\\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots\\ -\frac{\partial s_{1}}{\partial p_{J}} (p^{pre}) \cdot D_{J,1} & -\frac{\partial s_{2}}{\partial p_{J}} (p^{pre}) \cdot D_{J,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{J}} (p^{pre}) \cdot D_{J,j}& \cdots & -\frac{\partial s_{J}}{\partial p_{J}} (p^{pre}) \cdot D_{J,J}\\ }$

with $D_{k,j} = \begin{cases} 1 & \text{if products k and j are produced by the same firm} \\ 0 & \text{otherwise} \\ \end{cases}$

Partial derivatives $$\frac{\partial s_j}{\partial p_k}$$ can be calculated based on the elasticity $$\eta_{jk} = \frac{\partial s_j }{\partial p_k }\frac{ p_k}{ s_j}$$, so $\frac{\partial s_j}{\partial p_k} = \eta_{jk} \cdot \frac{ s_j}{ p_k}$

In the following code chunk, these objects in a market i are labeled as follows:

• own_prod_pre_i ($$D_{k,j}$$)
• elasticities_i ($$\eta_{jk}$$)
• derivatives_i ($$\eta_{jk} \cdot \frac{ s_j}{ p_k}$$)
• -solve(t(derivatives_i) * own_prod_pre_i) ($$\Omega^{pre}(p^{pre})^{-1}$$)
• shareObj$shares ($$\hat{s}(p^{pre})$$). ## calculate sij shareObj <- getShareInfo( blp_data = car_data_updated, par_theta2 = theta2_all, printLevel = 0 ) ## computation of marginal costs market_id <- car_data$parameters$market_id nmkt <- length(unique(market_id)) markups <- numeric(length(market_id)) sh <- shareObj$shares
prices_pre <- car_data$data$X_rand[, "price"]

for (i in 1:nmkt) {
mkt_ind <- market_id == i
share_i <- sh[ mkt_ind ]
price_pre_i <- prices_pre[ mkt_ind ]
scalar_i <- matrix(1 / share_i) %*% matrix(price_pre_i, nrow = 1)
elasticities_i <- get_elasticities(
blp_data = car_data_updated,
share_info = shareObj,
theta_lin = theta1_price,
variable = "price",
market = i,
printLevel = 0
)

derivatives_i <- elasticities_i / scalar_i # partial derivatives of shares wrt price
own_pre_i <- own_pre[ mkt_ind, ]
own_prod_pre_i <- own_pre_i %*% t(own_pre_i) # if element (i,j) equals 1, that means that prod i and j are produced by same firm
markups[mkt_ind] <- c(-solve(t(derivatives_i) * own_prod_pre_i) %*% share_i)
}
marg_cost <- prices_pre - markups

The ownership matrix is adjusted to implement a hypothetical merger between Chrysler and GM:

# Merger between company 16 and 19 (i.e. GM and Chrysler)
prices_post <- numeric(2217)
own_post <- cbind(
own_pre[, 1:15],
own_pre[, 16] + own_pre[, 19],
own_pre[, 17:18],
own_pre[, 20:26]
)

To analyze the effect on prices the FOC of the new equilibrium must be solved: $p^{post} - \widehat{mc} = \Omega^{post}(p^{post})^{-1} \hat{s}(p^{post})$

The solution of this set of non-linear equations is obtained by the function foc_bertrand_mkt and the package nleqslv:

foc_bertrand_mkt <- function(par, own_prod, blp_data, mkt, marg_cost, theta_lin, theta_rc) {
# argument par: candidate for post merger prices
# arguments own_prod, blp_data, mkt, marg_cost, theta_lin, theta_rc: see previous code blocks

# post merger updates: update the BLP_data object for market i
tmp <- data.frame(
"id" = blp_data$parameters$product_id,
"cdid" = blp_data$parameters$market_id,
"delta" = blp_data$data$delta,
"price" = blp_data$data$X_rand[, "price"]
)

market_ind <- blp_data$parameters$market_id == mkt
delta_old <- blp_data$data$delta
prices_pre <- blp_data$data$X_rand[, "price"]
tmp$price[ market_ind ] <- par tmp$delta[ market_ind ] <- delta_old[market_ind] - prices_pre[market_ind] * theta_lin + par * theta_lin

new_blp_data <- update_BLP_data(
blp_data = blp_data,
data_update = tmp
)

ShareObj <- getShareInfo(
blp_data = new_blp_data,
par_theta2 = theta_rc,
printLevel = 0
)

implied_shares <- as.matrix(ShareObj$shares[market_ind]) elasticities_post_mkt <- get_elasticities( blp_data = new_blp_data, share_info = ShareObj, theta_lin = theta_lin, variable = "price", market = mkt, printLevel = 0 ) scalar_mkt <- matrix(1 / implied_shares) %*% matrix(par, nrow = 1) derivatives_mkt <- elasticities_post_mkt / scalar_mkt markups_post <- c(-solve(t(derivatives_mkt) * own_prod) %*% implied_shares) differences <- par - marg_cost[market_ind] - markups_post return(differences) } Finally, the function is used to compute the new equilibrium: library(nleqslv) # to solve non linear first order conditions for (i in 1:nmkt) { mkt_ind <- market_id == i own_post_i <- own_post[ mkt_ind, ] own_prod_post_i <- own_post_i %*% t(own_post_i) price_pre_i <- prices_pre[ mkt_ind ] solution <- nleqslv( x = price_pre_i, foc_bertrand_mkt, # startingguesses: price_pre_i own_prod = own_prod_post_i, blp_data = car_data_updated, mkt = i, marg_cost = marg_cost, theta_lin = theta1_price, theta_rc = theta2_all ) prices_post[ market_id == i ] <- solution$x
}