Study on Likelihood-Ratio-Based Multivariate EWMA Control Chart Using Lasso

Purpose: When applying exponentially weighted moving average (EWMA) multivariate control charts to multivariate statistical process control, in many cases, only some elements of the controlled parameters change. In such situations, control charts applying Lasso are useful. This study proposes a novel multivariate control chart that assumes that only a few elements of the controlled parameters change. Methodology/Approach: We applied Lasso to the conventional likelihood ratio-based EWMA chart; specifically, we considered a multivariate control chart based on a log-likelihood ratio with sparse estimators of the mean vector and variance-covariance matrix. Findings: The results show that 1) it is possible to identify which elements have changed by confirming each sparse estimated parameter, and 2) the proposed procedure outperforms the conventional likelihood ratio-based EWMA chart regardless of the number of parameter elements that change. Research Limitation/Implication: We perform sparse estimation under the assumption that the regularization parameters are known. However, the regularization parameters are often unknown in real life; therefore, it is necessary to discuss how to determine them. Originality/Value of paper: The study provides a natural extension of the conventional likelihood ratio-based EWMA chart to improve interpretability and detection accuracy. Our procedure is expected to solve challenges created by changes in a few elements of the population mean vector and population variance-covariance matrix. Category: Research paper. QUALITY INNOVATION PROSPERITY / KVALITA INOVÁCIA PROSPERITA 25/1 – 2021, ISSN 1335-1745 (print), ISSN 1338-984X (online)


INTRODUCTION
Exponentially weighted moving average (EWMA) multivariate control charts are used to detect small, persistent changes in a process. They have been applied in various fields, such as quality control and the healthcare sector (e.g., Morton et al., 2001; Yu et al., 2011). Among multivariate control charts, Hotelling's control chart (Hotelling, 1947) for detecting changes in the mean vector was originally used. However, this chart does not always detect small changes, so research on multivariate EWMA control charts, which are designed to detect such changes, has become popular. For example, Lowry et al. (1992) proposed a multivariate EWMA (MEWMA) control chart for detecting changes in the mean vector, and Hawkins and Maboudou-Tchao (2008) proposed a multivariate exponentially weighted moving covariance matrix (MEWMC) control chart for detecting changes in the variance-covariance matrix. Additionally, to identify changes in both the mean vector and the variance-covariance matrix, Zhang, Li and Wang (2010) proposed a control chart based on the likelihood ratio, using exponentially weighted moving average estimators of the mean vector and covariance matrix, termed the ELR control chart.
However, when applying EWMA multivariate control charts to multivariate statistical process control, in many cases only some of the elements of the controlled parameters change. As a result, various multivariate control charts based on Lasso (Tibshirani, 1996) have been proposed for detecting changes quickly in such situations. For example, the Lasso-based multivariate EWMA (LEWMA) control chart (Zou and Qiu, 2009), which assumes that a few elements of the mean vector change, and the Lasso-based multivariate EWMC (LEWMC) control chart (Maboudou-Tchao and Diawara, 2013), which assumes that a few elements of the variance-covariance matrix change, apply Lasso to the MEWMA and MEWMC control charts, respectively.
The purpose of this study is to propose a novel multivariate control chart, which assumes that a few elements of the mean vector and the variance-covariance matrix change. Specifically, we aim to achieve this by applying the Lasso to the ELR control chart. The application of Lasso not only improves the accuracy of change detection, but also the ease of interpretation in identifying the variables that affect the change. Subsequently, we evaluate the performance of the proposed procedure by using real data analysis and Monte Carlo simulation.
The remainder of this paper is structured as follows. In Section 2, we describe the ELR control chart according to the existing studies and present an overview of the analysis. In Section 3, we propose a novel multivariate control chart by applying the Lasso to the ELR control chart. In Section 4, we evaluate the proposed multivariate control chart, mainly in terms of interpretability, using real data analysis. In Section 5, we evaluate the proposed multivariate control chart in terms of change detection accuracy by applying Monte Carlo simulation. Finally, in Section 6, we present the conclusions and scope for future research.

EXISTING RESEARCH
In this section, we discuss the ELR control chart (Zhang, Li and Wang, 2010) as per existing research and describe the analysis process.
As described in Section 1, the ELR control chart is a type of EWMA multivariate control chart. In the case of multivariate control charts, a situation in which a controlled process is stable is referred to as "in control". Conversely, a situation in which a controlled process is unstable is termed "out of control". In general, the purpose of EWMA multivariate control charts is to determine whether a controlled process is in or out of control whenever new data are observed.
When the process is in control in the ELR control chart, it is assumed that the p-dimensional vector $x$ follows the p-dimensional multivariate normal distribution $N_p(\mu_0, \Sigma_0)$, where $\mu_0$ and $\Sigma_0$ refer to the population mean vector and the population variance-covariance matrix, respectively. Both parameters are assumed to be known. Therefore, using a matrix $A$ such that $A^{\top} A = \Sigma_0^{-1}$ and setting $y = A(x - \mu_0)$, we obtain $y \sim N_p(\mathbf{0}, I_p)$ if the process is in control, where $\mathbf{0}$ is the p-dimensional zero vector and $I_p$ is the p-dimensional identity matrix. However, when the process is out of control, $y$ follows the general p-dimensional multivariate normal distribution $N_p(\mu, \Sigma)$, where $\mu \neq \mathbf{0}$ or $\Sigma \neq I_p$.
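The standardization described above can be sketched as follows. This is a minimal illustration, assuming the matrix A is taken as the inverse of the Cholesky factor of the in-control covariance matrix (any A with the same whitening property would serve); the function name `standardize` and the numerical values are hypothetical and not taken from the paper.

```python
import numpy as np

def standardize(x, mu0, sigma0):
    """Map x ~ N(mu0, sigma0) to y = A (x - mu0), where A^T A = sigma0^{-1}."""
    # Cholesky factor: sigma0 = L L^T, so A = L^{-1} satisfies A sigma0 A^T = I.
    L = np.linalg.cholesky(sigma0)
    # Solve L y = (x - mu0) rather than forming the inverse explicitly.
    return np.linalg.solve(L, np.asarray(x, dtype=float) - mu0)

# Hypothetical in-control parameters, for illustration only.
mu0 = np.array([1.0, -2.0])
sigma0 = np.array([[4.0, 1.2], [1.2, 1.0]])
A = np.linalg.inv(np.linalg.cholesky(sigma0))
y = standardize(np.array([2.0, -1.5]), mu0, sigma0)
```

After this transformation, the in-control distribution is the standard normal, so the chart only has to test against a zero mean vector and an identity covariance matrix.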
Under these assumptions in the ELR control chart, we determine whether a process is in control or out of control at each time $t = 1, 2, \dots$. If $y_t$ is the p-dimensional observation at time $t$, following $N_p(\mu, \Sigma)$, this problem can be formulated in the framework of statistical hypothesis testing: "null hypothesis: $\mu = \mathbf{0}$ and $\Sigma = I_p$" versus "alternative hypothesis: $\mu \neq \mathbf{0}$ or $\Sigma \neq I_p$." If the null hypothesis is not rejected, the process is judged to be in control, and if the null hypothesis is rejected, it is judged to be out of control. The test statistic for change detection is defined as follows.
First, we define the sample mean vector $u_t$ and the sample covariance matrix $S_t$, weighted by $\lambda$ ($0 \le \lambda \le 1$), in Equations (1) and (2), where $u_0 = \mathbf{0}$ and $S_0 = I_p$. In EWMA-type control charts, it is recommended that a small value of 0.1 to 0.2 be used for $\lambda$ (Wang, Yeh and Li, 2014). Therefore, in this paper, the value of $\lambda$ is fixed at 0.1 for the subsequent real data analysis and the Monte Carlo simulation.
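The recursions behind the weighted sample mean vector and covariance matrix can be sketched as below. The exact form of Equations (1) and (2) is not reproduced in this text, so the sketch assumes the common EWMA form in which the covariance update is centred at the updated mean; the function name `ewma_update` and the argument names are hypothetical.

```python
import numpy as np

def ewma_update(u_prev, S_prev, y, lam=0.1):
    """One EWMA step for the weighted mean vector and covariance matrix.

    Assumed form (not confirmed by the text):
      u_t = lam * y_t + (1 - lam) * u_{t-1}
      S_t = lam * (y_t - u_t)(y_t - u_t)^T + (1 - lam) * S_{t-1}
    """
    u = lam * y + (1.0 - lam) * u_prev
    d = y - u
    S = lam * np.outer(d, d) + (1.0 - lam) * S_prev
    return u, S

# Start from the in-control values: zero mean vector and identity covariance.
u0, S0 = np.zeros(2), np.eye(2)
u1, S1 = ewma_update(u0, S0, np.array([1.0, 0.0]), lam=0.1)
```

With a weight of 0.1, each new observation moves the estimates only slightly, which is what lets the chart accumulate evidence of small, persistent changes.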
Next, we define a special log-likelihood function, Equation (3), in which the sample mean vector and the sample variance-covariance matrix of the multivariate normal distribution for samples of size $t = 1, 2, \dots$ are replaced by $u_t$ and $S_t$, respectively. The test statistic, Equation (4), is then the ratio of Equation (3) evaluated at its constrained maximum likelihood estimator under the null hypothesis to Equation (3) evaluated at its unconstrained maximum likelihood estimator, where $\mathrm{Tr}(\cdot)$ and $\det(\cdot)$ denote the matrix trace and the determinant, respectively, and $\|\cdot\|$ denotes the L2 norm of a vector. Hereafter, we refer to the test statistic in Equation (4) as the chart statistic, following the conventions of multivariate control charts.
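As an illustration of how such a chart statistic might be computed, the sketch below uses the usual Gaussian likelihood-ratio form built from the trace, the log-determinant, and the squared L2 norm. This form is an assumption consistent with the quantities named in the text; the scaling and constants in the paper's Equation (4) may differ, and the function name `elr_statistic` is hypothetical.

```python
import numpy as np

def elr_statistic(u, S):
    """Assumed ELR-type statistic: Tr(S) - log det(S) + ||u||^2 - p."""
    p = u.shape[0]
    sign, logdet = np.linalg.slogdet(S)  # numerically stable log-determinant
    if sign <= 0:
        raise ValueError("S must be positive definite")
    return float(np.trace(S) - logdet + u @ u - p)

# At the in-control values (zero mean, identity covariance) the statistic is zero.
stat_in_control = elr_statistic(np.zeros(3), np.eye(3))
# Inflating every variance to 2 moves the statistic away from zero.
stat_inflated = elr_statistic(np.zeros(2), 2.0 * np.eye(2))
```

The statistic is non-negative for any positive definite S and grows as the weighted mean moves away from zero or the weighted covariance moves away from the identity, so a single upper control limit suffices.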

PROPOSED PROCEDURE
In this section, we propose a novel multivariate control chart (hereinafter referred to as the LELR control chart), which applies the Lasso to the ELR control chart. In the proposed procedure, as in the ELR control chart, we perform change detection on the p-dimensional observation $y_t$ at time $t = 1, 2, \dots$ within the framework of statistical hypothesis testing: "null hypothesis: $\mu = \mathbf{0}$ and $\Sigma = I_p$" versus "alternative hypothesis: $\mu \neq \mathbf{0}$ or $\Sigma \neq I_p$." However, when the null hypothesis does not hold, we assume that only a few elements of the mean vector and covariance matrix have changed. The chart statistic (test statistic) for change detection is defined as follows.
First, as in the ELR control chart, we define the sample mean vector $u_t$ and the sample covariance matrix $S_t$, weighted by $\lambda$ ($0 \le \lambda \le 1$), in Equations (1) and (2). Next, we define the penalized log-likelihood function, Equation (5), which is the special log-likelihood function defined in Equation (3) plus regularization terms, where $\gamma_1$ and $\gamma_2$ are non-negative constants, and $P_1(\mu)$ and $P_2(\Sigma)$ are regularization terms for the parameters $\mu$ and $\Sigma$, respectively. In this paper, we discuss the case in which Lasso-type regularization terms are adopted, although several choices are available, including Lasso (Tibshirani, 1996), adaptive Lasso (Zou, 2006), and SCAD (Fan and Li, 2001) regularization terms.
The chart statistic, Equation (6), is then the ratio of Equation (3) evaluated at its constrained maximum likelihood estimator under the null hypothesis to Equation (3) evaluated at $\hat{\mu}$ and $\hat{\Sigma}$, where $\hat{\mu}$ and $\hat{\Sigma}$ are the unconstrained maximizers of the penalized log-likelihood in Equation (5).
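One way to read this construction is sketched below: the unpenalized Gaussian log-likelihood kernel is evaluated once at the null values and once at the sparse estimates, and the difference is the chart statistic. The kernel and its scaling are assumptions, since Equations (3) and (6) are not reproduced in this text; the names `lelr_statistic`, `mu_hat`, and `sigma_hat` are hypothetical.

```python
import numpy as np

def lelr_statistic(u, S, mu_hat, sigma_hat):
    """Assumed likelihood-ratio statistic with plugged-in sparse estimates."""
    p = u.shape[0]

    def neg2_loglik(mu, sigma):
        # Kernel of -2 * Gaussian log-likelihood with u, S as sufficient statistics:
        # log det(Sigma) + Tr(Sigma^{-1} S) + (u - mu)^T Sigma^{-1} (u - mu)
        _, logdet = np.linalg.slogdet(sigma)
        r = u - mu
        return logdet + np.trace(np.linalg.solve(sigma, S)) + r @ np.linalg.solve(sigma, r)

    # Null hypothesis values: zero mean vector and identity covariance matrix.
    return float(neg2_loglik(np.zeros(p), np.eye(p)) - neg2_loglik(mu_hat, sigma_hat))

u = np.array([0.3, -0.1])
S = np.array([[1.5, 0.2], [0.2, 0.8]])
# Plugging in the unpenalized estimates (u, S) recovers the ELR-type form
# Tr(S) - log det(S) + ||u||^2 - p.
stat_unpenalized = lelr_statistic(u, S, u, S)
```

Under this reading, setting both regularization parameters to zero makes the sparse estimates coincide with the weighted sample mean and covariance, so the statistic reduces to the unpenalized likelihood-ratio form.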
Note that in Equation (5), if we choose adaptive Lasso as $P_1$ and estimate the parameters with $\gamma_2$ set to zero, the chart statistic is consistent with the chart statistic in the LEWMA control chart (Zou and Qiu, 2009). Similarly, if we choose Lasso as $P_2$ and estimate the parameters with $\gamma_1$ set to zero, the chart statistic is consistent with the chart statistic in the LEWMC control chart (Maboudou-Tchao and Diawara, 2013). Therefore, the chart statistic of the proposed procedure is a natural extension of the chart statistics of the LEWMA and LEWMC control charts to the case in which both the mean vector and the covariance matrix change. However, the optimal solution of Equation (5) cannot be obtained analytically, and it is difficult to optimize $\mu$ and $\Sigma$ simultaneously. For this reason, we use the following estimation algorithm to optimize $\mu$ and $\Sigma$.
In our optimization algorithm, we alternately update the solutions corresponding to the estimates of $\mu$ and $\Sigma$ until the values converge. That is, the update formulas for $\hat{\mu}^{(k)}$ and $\hat{\Sigma}^{(k)}$ at iteration $k = 1, 2, \dots$ are defined in Equations (7) and (8), where $|\mu_j|$ is the absolute value of the j-th element of $\mu$ and $\|\cdot\|_1$ is the sum of the absolute values of the elements of a matrix. The initial value of $\hat{\Sigma}$ is $S_t$. Note that, by using an estimation algorithm as in Wang, Yeh and Li (2014), we can apply a modified version of the algorithm of Bien and Tibshirani (2011) to solve Equation (8), which simplifies the implementation. In addition, the stopping rule is $D_{KL} < \varepsilon$, where $D_{KL}$ is the Kullback-Leibler divergence (Kullback and Leibler, 1951) between the normal distributions with parameters $(\hat{\mu}^{(k)}, \hat{\Sigma}^{(k)})$ and $(\hat{\mu}^{(k-1)}, \hat{\Sigma}^{(k-1)})$, and $\varepsilon$ is a small positive tolerance.
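Two pieces of this alternating scheme can be sketched in a few lines: a Lasso-penalized mean update with the covariance held fixed, and the Kullback-Leibler divergence used in the stopping rule. The mean update below uses plain coordinate descent on an assumed objective, 0.5 (u - mu)^T Sigma^{-1} (u - mu) + gamma1 ||mu||_1, whose scaling may differ from the paper's Equation (7). The covariance update of Bien and Tibshirani (2011) is not implemented here, and the direction of the divergence is also an assumption. All function names are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_mean_update(u, sigma, gamma1, n_sweeps=100, tol=1e-10):
    """Coordinate descent for 0.5 (u-mu)^T W (u-mu) + gamma1 * ||mu||_1, W = sigma^{-1}."""
    W = np.linalg.inv(sigma)
    mu = u.astype(float).copy()
    for _ in range(n_sweeps):
        mu_old = mu.copy()
        for j in range(len(u)):
            # Contribution of the other coordinates: sum_{k != j} W[j, k] (mu_k - u_k)
            others = W[j] @ (mu - u) - W[j, j] * (mu[j] - u[j])
            mu[j] = soft_threshold(W[j, j] * u[j] - others, gamma1) / W[j, j]
        if np.max(np.abs(mu - mu_old)) < tol:
            break
    return mu

def gaussian_kl(mu0, sigma0, mu1, sigma1):
    """KL divergence from N(mu0, sigma0) to N(mu1, sigma1)."""
    p = mu0.shape[0]
    d = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(sigma0)
    _, logdet1 = np.linalg.slogdet(sigma1)
    return 0.5 * (np.trace(np.linalg.solve(sigma1, sigma0))
                  + d @ np.linalg.solve(sigma1, d) - p + logdet1 - logdet0)

# With an identity covariance the update reduces to elementwise soft-thresholding.
mu_sparse = lasso_mean_update(np.array([0.05, 1.0, -0.3]), np.eye(3), gamma1=0.1)
```

In a full implementation, the mean and covariance updates would be alternated and the loop stopped once `gaussian_kl` between successive iterates falls below the chosen tolerance.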
Finally, note that if $(\gamma_1, \gamma_2) = (0, 0)$, the proposed procedure is consistent with the ELR control chart. Therefore, if we set appropriate regularization parameters, the proposed procedure achieves accuracy equal to or greater than that of the ELR control chart. However, in this study, how to determine the optimal regularization parameters is left as an open question, and the discussion proceeds on the assumption that their values are known. In the subsequent numerical experiments, the regularization parameters are determined by preliminary experiments.

REAL DATA ANALYSIS
In this section, we evaluate the proposed procedure mainly in terms of interpretability, using real data analysis.

Experimental Settings
The experimental settings are as follows. The comparison target is the ELR control chart. The data used (Hawkins and Maboudou-Tchao, 2008) are four consecutive measurements of mean systolic blood pressure, mean diastolic blood pressure, mean heart rate, and overall mean arterial pressure for 24 periods. Because all periods in these data follow the change point, the question of interest is how quickly the change can be detected. Further, the population mean vector and the population variance-covariance matrix prior to the change point are known. Refer to Hawkins and Maboudou-Tchao (2008) for the details of the data.
As an evaluation criterion, we use the average run length (ARL), which is the average number of periods until the chart statistic first exceeds the control limit line. The ARL under the null hypothesis is referred to as the In Control ARL (IC-ARL), and the ARL under the alternative hypothesis is termed the Out of Control ARL (OC-ARL). We set the control limit line so that IC-ARL = 200. In addition, we examine the change of the chart statistic in the proposed procedure and the ELR control chart, where the regularization parameters for the proposed procedure are $(\gamma_1, \gamma_2) = (0.1, 0.1)$. The value of the weight $\lambda$ in Equations (1) and (2) is 0.1 for both the proposed procedure and the ELR control chart.

Experimental Results
The experimental results are presented in Figure 1, which shows the control charts of (a) the change in the chart statistic of the proposed procedure, and (b) the change in the chart statistic of the ELR control chart.
In Figure 1a, it is observed that the points of the 23rd and 24th periods exceed the control limit line in the proposed procedure. On the other hand, Figure 1b shows that the points of the 19th, 20th, 23rd, and 24th periods exceed the control limit line in the ELR control chart.
Next, we show the changes in the mean vector and the variance-covariance matrix in periods 23 and 24, which were outside the control limit line in the proposed procedure, as estimated by (a) the proposed procedure and (b) the ELR control chart. These results show that, in the proposed procedure, certain elements are estimated as exactly zero, which makes it possible to identify the changing variables. For example, we observe shifts in mean diastolic blood pressure, mean heart rate, and overall mean arterial pressure, and a greater variance in overall mean arterial pressure. By applying Lasso, we confirm that the proposed procedure improves interpretability while maintaining the same accuracy as the ELR control chart.

MONTE CARLO SIMULATION
In this section, we evaluate the proposed procedure mainly in terms of change detection accuracy by Monte Carlo simulation.

Simulation Settings
The simulation settings are as follows. First, the comparison target is the ELR control chart. Next, we prepare four sets of data for the different types of changes. Specifically, we generate multivariate normal random numbers at time $t = 1, 2, \dots$ according to Equations (11)-(14), where $\boldsymbol{\delta}$ is the $(10 - q)$-dimensional vector with each element equal to $r$. In addition, $\Sigma_c$ is a $(10 - q) \times (10 - q)$ square matrix with off-diagonal elements of $r$, $\Sigma_v$ is a $(10 - q) \times (10 - q)$ square matrix with diagonal elements of $1 + r$, and $\Sigma_{vc}$ is a $(10 - q) \times (10 - q)$ square matrix with diagonal elements of $1 + r$ and off-diagonal elements of $r$. In Equations (11) and (13), we vary $q$ over the range $0 \le q \le 9$, and in Equations (12) and (14), we vary $q$ over the range $0 \le q \le 8$, and perform simulations. The regularization parameters of the proposed procedure are set to $\gamma_1 \in \{0, 0.01, 0.1\}$ and $\gamma_2 \in \{0, 0.01, 0.1\}$, and the best of the nine combinations is selected by preliminary experiments. The value of the weight $\lambda$ in Equations (1) and (2) is 0.1 for both the proposed procedure and the ELR control chart.
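The out-of-control parameters described above can be sketched as follows. Equations (11)-(14) are not fully reproduced in this text, so the sketch makes several assumptions: the dimension is 10, the changed elements occupy the leading 10 - q coordinates, unchanged variances are 1, and unchanged means and covariances are 0. The function names `shifted_mean` and `shifted_cov` are hypothetical.

```python
import numpy as np

P = 10  # dimension implied by the (10 - q)-dimensional blocks in the text

def shifted_mean(q, r):
    """Mean vector whose first P - q elements equal r (assumed placement)."""
    mu = np.zeros(P)
    mu[:P - q] = r
    return mu

def shifted_cov(q, r, variances=False, covariances=False):
    """Identity covariance with a changed leading (P - q) x (P - q) block."""
    m = P - q
    block = np.eye(m)
    if covariances:
        block = block + r * (np.ones((m, m)) - np.eye(m))  # off-diagonals become r
    if variances:
        block = block + r * np.eye(m)                      # diagonals become 1 + r
    sigma = np.eye(P)
    sigma[:m, :m] = block
    return sigma

rng = np.random.default_rng(0)
# Mean-shift scenario: three shifted elements (q = 7), shift size r = 0.1.
mean_samples = rng.multivariate_normal(shifted_mean(7, 0.1), np.eye(P), size=5)
# Covariance-only scenario: two correlated elements (q = 8), r = 0.1.
cov_samples = rng.multivariate_normal(np.zeros(P), shifted_cov(8, 0.1, covariances=True), size=5)
```

A larger q means fewer changed elements, so q close to its upper bound corresponds to the sparse situation the proposed procedure is designed for.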
As in the real data analysis, we use ARL as the evaluation criterion. In this simulation, we set the control limit line so that IC-ARL = 200. In addition, we evaluate the mean OC-ARL over 5000 simulations; a smaller OC-ARL is preferable.
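A Monte Carlo estimate of the ARL for a given control limit can be sketched as below. For brevity the sketch uses the unpenalized ELR-type statistic in an assumed form, not the proposed LELR statistic; the control limit `h`, the truncation `max_t`, and the function names are hypothetical, and in practice `h` would be tuned (for example by a simple search) until the in-control estimate is close to 200.

```python
import numpy as np

def run_length(h, mean, cov, lam=0.1, max_t=2000, rng=None):
    """Number of periods until the chart statistic first exceeds h."""
    rng = np.random.default_rng() if rng is None else rng
    p = len(mean)
    u, S = np.zeros(p), np.eye(p)  # in-control starting values
    for t in range(1, max_t + 1):
        y = rng.multivariate_normal(mean, cov)
        u = lam * y + (1.0 - lam) * u
        d = y - u
        S = lam * np.outer(d, d) + (1.0 - lam) * S
        stat = np.trace(S) - np.linalg.slogdet(S)[1] + u @ u - p
        if stat > h:
            return t
    return max_t  # censored: no signal within max_t periods

def estimate_arl(h, mean, cov, n_rep=200, seed=0, **kwargs):
    """Average run length over n_rep simulated sequences."""
    rng = np.random.default_rng(seed)
    return float(np.mean([run_length(h, mean, cov, rng=rng, **kwargs) for _ in range(n_rep)]))

# A limit below zero signals immediately, because the statistic is non-negative.
rl_immediate = run_length(-1.0, np.zeros(3), np.eye(3), rng=np.random.default_rng(0))
```

Calling `estimate_arl` with the in-control parameters gives an IC-ARL estimate, and calling it with shifted parameters at the same limit gives the corresponding OC-ARL estimate.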

Simulation Results
Figures 2-5 show the results of this experiment for r = 0.1. In each figure, the solid line shows the change in OC-ARL of the proposed procedure, and the dotted line shows the change in OC-ARL of the ELR control chart. Note that, for each value of q, we report the best-performing result among the nine combinations of regularization parameters for the proposed procedure. We obtained the same results for r = 0.5 and r = 1.0.

Figure 3 - Comparison of OC-ARL for the Proposed Procedure and the ELR Control Chart (Only the Covariances of the Covariance Matrix Change)
From Figures 2-5, it is observed that in all situations in Equations (11)-(14), the proposed procedure shows greater accuracy than the ELR control chart. Although the proposed procedure assumes that both the mean vector and the covariance matrix change, it detects changes without any problems even in situations where only one of them changes.
It is noted that regardless of the number of changing elements, the proposed procedure shows greater accuracy than the ELR control chart. The proposed procedure assumes a situation in which a few elements change. In the data generation method assumed in this simulation, the effect of Lasso's shrinkage estimation may have resulted in satisfactory accuracy, even in situations in which many elements change. In this simulation, the change is fixed at r for all elements, but it should be noted that if different values are set for each element, the shrinkage estimation may not function well, and the accuracy may be inadequate.

CONCLUSION
In this study, we adopted the ELR control chart, which is one of the EWMA-type multivariate control charts, and proposed a novel multivariate control chart by applying Lasso. Specifically, we suggested a multivariate control chart based on a log-likelihood ratio with sparse estimators of the mean vector and the variance-covariance matrix, assuming a situation in which a few elements of each parameter change. Through real data analysis, we showed that it is possible to identify which elements of the mean vector and the variance-covariance matrix have changed by confirming each of the sparse estimated parameters. Through Monte Carlo simulations, we showed that the proposed procedure outperforms the ELR control chart regardless of the number of elements that change in the parameters. We also showed that the proposed procedure can be applied both to situations where the mean vector and the variance-covariance matrix change together and to situations where only one of them changes.
The scope for future research is as follows. In this study, we proceeded on the assumption that appropriate regularization parameters are known. As shown in Section 5, the proposed procedure achieves equal or improved accuracy compared to the ELR control chart if appropriate regularization parameters are set. For practical purposes, however, it is necessary to determine the regularization parameters from the observed data. Whether cross-validation and various information criteria can be applied to determine the regularization parameters should also be verified. In addition, we conducted the experiments with the value of the weight $\lambda$ fixed at 0.1 in Sections 4 and 5, and we would like to discuss how to determine this value. Finally, although it uses a different statistic, the pGLR chart (Wang, Yeh and Li, 2014), which applies Lasso to the GLR chart, can also detect changes in the mean vector and the variance-covariance matrix, as the ELR chart does. It would prove valuable to compare the proposed procedure with these control charts.