An Interactive Model of Target and Context for Aspect-Level Sentiment Classification.
Aspect-level sentiment classification is a fine-grained task in sentiment analysis. Given a review and a target occurring in the review, it aims to identify the sentiment polarity (e.g., negative, neutral, or positive) expressed toward each target in its context. For example, in the review "the voice quality of this phone is amazing, but the price is ridiculous," there are two targets ("voice quality" and "price") with completely opposite polarities: the sentiment expressed toward the target "voice quality" is positive, whereas the sentiment toward the target "price" is negative. Jiang et al. introduced a target-dependent Twitter sentiment classifier and showed that ignoring the target information discussed in a review causes 40% of sentiment classification errors. The task of aspect-level sentiment classification can therefore be viewed as predicting a sentiment category for a review-target pair.
Different from sentence- and document-level sentiment analysis [3-6], in aspect-level sentiment classification a review may contain multiple review-target pairs, so separating different contexts for different targets is a challenge. Many neural network methods have been proposed for this task. For example, Dong et al. used an adaptive recursive neural network to evaluate the sentiments of specific targets in context words. Vo and Zhang separated the whole review into three sections (the target, its left context, and its right context) and used neural pooling functions and a sentiment lexicon to extract the feature vector for a given target. Tang et al. divided the review into the left part with the target and the right part with the target and then used two long short-term memory (LSTM) networks to encode the two parts, respectively. Zhang et al. used a gated neural network to capture the interaction between the target and its surrounding contexts. To further focus on the words of the sentence that modulate the sentiment of the targets, Wang et al. introduced LSTM networks with an attention mechanism, concatenating word representations with target embeddings to generate the final sentiment representations.
Although the previous approaches recognize the importance of targets in sentiment classification, they only focus on the impact of targets on context modeling. How to use the interaction between contexts and the target phrase to model both of them separately has become a new research issue. Ma et al. proposed an interactive attention network (IAN) that uses two LSTM networks to model the contexts and the target phrase, respectively, and then uses the hidden states of the contexts to generate an attention vector for the target phrase, and vice versa. Building on IAN, Huang et al. proposed an attention-over-attention (AOA) neural network, which models targets and reviews simultaneously using two LSTMs and lets the target and text representations interact through the AOA module. Zheng and Xia designed a left-center-right separated neural network that models the left context, target phrase, and right context, respectively, and captures the relation between the target and the left/right context with a rotatory attention mechanism.
To further improve the representations of targets and contexts, we propose an interactive neural network model named LT-T-TR. First, it divides a review into three parts: the left context with the target phrase, the target phrase itself, and the right context with the target phrase. Three bidirectional long short-term memory networks (Bi-LSTMs) are used to model these parts, respectively. Second, because different words contribute differently to the final representation and because contexts and targets influence each other, the attention weights of the target phrase and the left/right contexts are computed by interactive attention between them. This process consists of two parts: target-to-context attention (target-to-left context and target-to-right context attention), which produces better representations of the left/right contexts; and context-to-target attention (left context-to-target and right context-to-target attention). After computing these attention weights, we obtain the target phrase and left/right context representations, which are then concatenated to generate the final classification vectors. Experimental results on laptop and restaurant datasets show that our method achieves obvious improvements. The main contributions of this study can be summarized as follows:
(a) Dividing a review into three parts: the left context with the target phrase, the target phrase, and the right context with the target phrase. Three BiLSTMs are used to model these parts, respectively.
(b) Computing attention weights of the left/right context and the target phrase and getting representations of the target phrase and the left/right context using attention weights.
(c) Concatenating these representations to form the final classification vectors and evaluating our model on laptop and restaurant datasets.
In this section, we first give the task definition of aspect-level sentiment classification. Afterward, we introduce the different components of our model as displayed in Figure 1.
2.1. Task Definition. Given a review $S = [w_1, \ldots, w_m, w_{m+1}, \ldots, w_{s-1}, w_s, \ldots, w_n]$ consisting of $n$ words, $w_1, w_2, \ldots, w_m$ are the preceding context words, $w_{m+1}, w_{m+2}, \ldots, w_{s-1}$ are the target words, and $w_s, w_{s+1}, \ldots, w_n$ are the following context words. We divide the review into three parts: the left context $LT = [w^l_1, w^l_2, \ldots, w^l_{s-1}]$ consisting of $w_1, w_2, \ldots, w_m$ and $w_{m+1}, w_{m+2}, \ldots, w_{s-1}$; the target phrase $T = [w^t_{m+1}, w^t_{m+2}, \ldots, w^t_{s-1}]$ consisting of $w_{m+1}, w_{m+2}, \ldots, w_{s-1}$; and the right context $RT = [w^r_{m+1}, w^r_{m+2}, \ldots, w^r_n]$ consisting of $w_{m+1}, w_{m+2}, \ldots, w_{s-1}$ and $w_s, w_{s+1}, \ldots, w_n$. Aspect-level sentiment classification aims at determining the sentiment polarity of review S toward target T. For example, the sentiment polarity of the review "the voice quality of this phone is amazing, but the price is ridiculous" toward the target "voice quality" is positive, but the polarity toward the target "price" is negative.
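To make the three-way split concrete, here is a minimal sketch; the `split_review` helper and its index convention are hypothetical, introduced only for illustration:

```python
# Hypothetical helper illustrating the LT/T/RT split: both context
# segments deliberately include the target words, as in the paper.
def split_review(words, target_start, target_end):
    """Return (LT, T, RT); target_start/target_end are 0-based, end exclusive."""
    lt = words[:target_end]             # left context + target phrase
    t = words[target_start:target_end]  # target phrase alone
    rt = words[target_start:]           # target phrase + right context
    return lt, t, rt

review = ("the voice quality of this phone is amazing , "
          "but the price is ridiculous").split()
lt, t, rt = split_review(review, 1, 3)  # target = "voice quality"
```

Note that the target words appear in all three segments, which is exactly what distinguishes LT-T-TR from the L-T-R variant discussed later.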
2.2. Bi-LSTMs. First, we map each word in S to its word embedding and get the word vectors $[v^l_1, \ldots, v^l_{s-1}]$, $[v^t_{m+1}, \ldots, v^t_{s-1}]$, and $[v^r_{m+1}, \ldots, v^r_n]$ for LT, T, and RT, respectively, where $d$ is the embedding dimension. Then, we feed these three groups of word vectors into three Bi-LSTMs, respectively, to learn the hidden word semantics. Each Bi-LSTM is obtained by stacking a forward LSTM and a backward LSTM, which are good at learning long-term dependencies. The LSTM architecture has three gates (an input gate, a forget gate, and an output gate) and a cell memory state. Each cell is updated as follows:
$i_k = \sigma(W_i \cdot [h_{k-1}; v_k] + b_i)$,
$f_k = \sigma(W_f \cdot [h_{k-1}; v_k] + b_f)$,
$o_k = \sigma(W_o \cdot [h_{k-1}; v_k] + b_o)$,
$\tilde{c}_k = \tanh(W_c \cdot [h_{k-1}; v_k] + b_c)$,
$c_k = f_k \odot c_{k-1} + i_k \odot \tilde{c}_k$,
$h_k = o_k \odot \tanh(c_k)$, (1)
where $\sigma$ is the sigmoid function, $\odot$ denotes elementwise multiplication, and $\cdot$ stands for matrix multiplication; the $W$ and $b$ terms denote the weight matrices and biases, respectively; $v_k$ is the input word vector, and $h_{k-1}$ is the previous hidden state.
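A minimal NumPy sketch of the cell update in equation (1); the fused gate layout (one matrix producing all four gate pre-activations) and the toy dimensions are implementation assumptions, not the authors' code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(v_k, h_prev, c_prev, W, b):
    """One LSTM cell update following equation (1)."""
    d_h = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, v_k]) + b  # pre-activations of all gates
    i = sigmoid(z[:d_h])             # input gate
    f = sigmoid(z[d_h:2 * d_h])      # forget gate
    o = sigmoid(z[2 * d_h:3 * d_h])  # output gate
    g = np.tanh(z[3 * d_h:])         # candidate cell state
    c = f * c_prev + i * g           # elementwise (odot) cell update
    h = o * np.tanh(c)               # new hidden state
    return h, c

rng = np.random.default_rng(0)
d_h, d = 4, 3  # toy hidden and embedding sizes
W = rng.normal(scale=0.1, size=(4 * d_h, d_h + d))
b = np.zeros(4 * d_h)
h, c = lstm_step(rng.normal(size=d), np.zeros(d_h), np.zeros(d_h), W, b)
```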
For the left context LT, the input of the Bi-LSTM is $[v^l_1, v^l_2, \ldots, v^l_{s-1}] \in \mathbb{R}^{(s-1) \times d}$, and we get the hidden states as follows:
$[h^l_1, h^l_2, \ldots, h^l_{s-1}] = \text{Bi-LSTM}([v^l_1, v^l_2, \ldots, v^l_{s-1}])$, (2)
where the output $h^l_k$ ($k = 1, \ldots, s-1$) is obtained by concatenating the corresponding hidden states of the forward and backward LSTMs. Similarly, we get the hidden semantic states $[h^t_{m+1}, h^t_{m+2}, \ldots, h^t_{s-1}]$ for target T and the hidden states $[h^r_{m+1}, h^r_{m+2}, \ldots, h^r_n]$ for the right context RT.
Then, through an average pooling operation, we obtain the initial representations of LT, T, and RT as follows:
$LT_{initial} = \frac{\sum_{k=1}^{s-1} h^l_k}{s-1}$, (3)
$T_{initial} = \frac{\sum_{k=m+1}^{s-1} h^t_k}{s-m-1}$, (4)
$RT_{initial} = \frac{\sum_{k=m+1}^{n} h^r_k}{n-m}$. (5)
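The average pooling in equations (3)-(5) is simply a mean over each segment's Bi-LSTM hidden states, e.g.:

```python
import numpy as np

def avg_pool(hidden_states):
    """hidden_states: (seq_len, hidden_dim) array of Bi-LSTM outputs.
    Returns the segment's initial representation, as in equations (3)-(5)."""
    return hidden_states.mean(axis=0)

# Toy left-context hidden states (3 words, hidden size 2).
h_lt = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
lt_initial = avg_pool(h_lt)
```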
2.3. Attention Layer. After getting the hidden representations of the contexts and the target phrase from the three Bi-LSTMs, we use the attention mechanism to compute the importance of each word in the left/right context and in the target phrase.
2.3.1. Target-to-Context Attention. Given the hidden representations of the left context $[h^l_1, h^l_2, \ldots, h^l_{s-1}]$ and the average representation of the target $T_{initial}$, we first get the target-to-left context attention representation $LT_{final}$ by
$LT_{final} = \sum_{k=1}^{s-1} \alpha^l_k h^l_k$, (6)
where $\alpha^l_k$ is the weight of $h^l_k$, obtained from a softmax function:
$\alpha^l_k = \frac{\exp(f_{att}(h^l_k, T_{initial}))}{\sum_{j=1}^{s-1} \exp(f_{att}(h^l_j, T_{initial}))}$. (7)
Here, $f_{att}$ is a score function that indicates the importance of a word in the left context as influenced by the target:
$f_{att}(h^l_k, T_{initial}) = \tanh(h^l_k \cdot W_a \cdot T^{\mathsf{T}}_{initial} + b_a)$, (8)
where tanh is a nonlinear function, $W_a$ is the weight matrix, $b_a$ is the bias, and $T^{\mathsf{T}}_{initial}$ is the transpose of $T_{initial}$.
Similarly to equations (6)-(8), we can obtain the target-to-right context attention representation $RT_{final}$ using the average representation of the target $T_{initial}$ and the hidden states of the right context.
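Equations (6)-(8) can be sketched in NumPy as follows; the toy dimensions, random values, and scalar bias are assumptions made only for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # shifted for numerical stability
    return e / e.sum()

def target_to_context(h_ctx, t_initial, W_a, b_a=0.0):
    """Score each context state against the pooled target vector (eq. 8),
    normalize with softmax (eq. 7), and return the weighted sum (eq. 6)."""
    scores = np.tanh(h_ctx @ W_a @ t_initial + b_a)  # one score per word
    alpha = softmax(scores)
    return alpha @ h_ctx, alpha

rng = np.random.default_rng(1)
h_ctx = rng.normal(size=(5, 6))   # 5 context words, hidden size 6
t_init = rng.normal(size=6)       # pooled target representation
W_a = rng.normal(scale=0.1, size=(6, 6))
lt_final, alpha = target_to_context(h_ctx, t_init, W_a)
```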
2.3.2. Context-to-Target Attention. For the hidden representations of the target $[h^t_{m+1}, h^t_{m+2}, \ldots, h^t_{s-1}]$, we first compute the attention weights as follows:
$\alpha^{lt}_k = \mathrm{softmax}(f_{att}(h^t_k, LT_{initial}))$, (9)
$f_{att}(h^t_k, LT_{initial}) = \tanh(h^t_k \cdot W_L \cdot LT^{\mathsf{T}}_{initial} + b_L)$, (10)
where $W_L$ and $b_L$ are the weight matrix and bias, respectively.
Then, by calculating the weighted combination of the hidden states of the target phrase, we obtain the left context-to-target representation as follows:
$T^{lt}_{final} = \sum_{k=m+1}^{s-1} \alpha^{lt}_k h^t_k$. (11)
Similarly to equations (9)-(11), we obtain the right context-to-target representation $T^{rt}_{final}$ by using $RT_{initial}$ and the hidden representations of the target.
After getting $T^{lt}_{final}$ and $T^{rt}_{final}$, we obtain the final representation of the target phrase by concatenating them:
$T_{final} = [T^{lt}_{final}; T^{rt}_{final}]$. (12)
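A sketch of equations (9)-(12) under a toy setup; `attend` is a hypothetical helper covering both the left and right context-to-target attention:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(h_states, query, W, b=0.0):
    """Weight the target's hidden states by their scores against a pooled
    context vector (equations 9-11)."""
    alpha = softmax(np.tanh(h_states @ W @ query + b))
    return alpha @ h_states

rng = np.random.default_rng(2)
h_t = rng.normal(size=(3, 6))    # target phrase hidden states (3 words)
lt_init = rng.normal(size=6)     # pooled left context, equation (3)
rt_init = rng.normal(size=6)     # pooled right context, equation (5)
W_L = rng.normal(scale=0.1, size=(6, 6))
W_R = rng.normal(scale=0.1, size=(6, 6))

t_lt = attend(h_t, lt_init, W_L)        # left context-to-target, eq. (11)
t_rt = attend(h_t, rt_init, W_R)        # right context-to-target
t_final = np.concatenate([t_lt, t_rt])  # equation (12)
```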
2.4. Final Classification. We then concatenate $LT_{final}$, $T_{final}$, and $RT_{final}$ as the final representation of review S:
$v = [LT_{final}; T_{final}; RT_{final}]$. (13)
We project v into the space of the C target classes through a nonlinear function:
$x = \tanh(W_v \cdot v + b_v)$, (14)
where $W_v$ and $b_v$ are the parameters. Finally, the probability that review S expresses sentiment polarity $c \in C$ toward target T is calculated as follows:
$P(y = c) = \frac{\exp(x_c)}{\sum_{i \in C} \exp(x_i)}$. (15)
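Equations (13)-(15) amount to a concatenation, a nonlinear projection, and a softmax; the toy dimensions below are assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(3)
# Toy final representations of the three segments.
lt_final = rng.normal(size=6)
t_final = rng.normal(size=12)
rt_final = rng.normal(size=6)

v = np.concatenate([lt_final, t_final, rt_final])  # equation (13)

C = 3  # positive / neutral / negative
W_v = rng.normal(scale=0.1, size=(C, v.shape[0]))
b_v = np.zeros(C)
x = np.tanh(W_v @ v + b_v)  # equation (14)
p = softmax(x)              # equation (15): class probabilities
```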
2.5. Model Training. The model is trained in an end-to-end way. The loss function is the cross-entropy error:
$\mathrm{loss} = -\sum_{(S,T) \in D} \sum_{c \in C} g(y_{(S,T)} = c) \cdot \log P(y_{(S,T)} = c)$, (16)
where D denotes all training data, (S,T) denotes a review-target pair, C is the set of sentiment categories, $P(y_{(S,T)} = c)$ is the probability of predicting (S,T) as class c given by the softmax function, and $g(y_{(S,T)} = c)$ indicates whether class c is the correct sentiment category.
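A minimal batched sketch of equation (16), assuming the gold labels are given as class indices (so the indicator g selects exactly one term per review-target pair):

```python
import numpy as np

def cross_entropy(probs, gold):
    """Equation (16) over a batch: probs is (N, C) predicted distributions,
    gold holds the index of the correct class for each review-target pair."""
    return -np.sum(np.log(probs[np.arange(len(gold)), gold]))

# Toy predictions for two review-target pairs over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
gold = np.array([0, 1])
loss = cross_entropy(probs, gold)
```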
3.1. Experimental Settings
3.1.1. Datasets. We conduct our experiments on the SemEval 2014 Task 4 dataset, which contains customer reviews of restaurants and laptops. Each review has one or more targets with their corresponding polarities. The polarity of a target can be positive, negative, neutral, or conflict; we only consider the first three labels for classification. The statistics of the datasets are shown in Table 1.
3.1.2. Parameters and Evaluation Metric. In our experiments, the dimensions of word embeddings, attention vectors, and LSTM hidden states are set to 300. All word embeddings are initialized with GloVe, and out-of-vocabulary words are randomly initialized from the uniform distribution U(-0.1, 0.1). All weight matrices are randomly initialized from the same uniform distribution, and all bias terms are set to zero. The dropout rate is set to 0.5.
We adopt the accuracy to evaluate the performance of our model, which is defined as follows:
$\mathrm{Acc} = \frac{T}{N}$, (17)
where T is the number of correctly predicted samples and N is the total number of samples.
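Equation (17) in code, with toy labels for illustration:

```python
# Accuracy as in equation (17): correctly predicted samples over all samples.
def accuracy(pred, gold):
    correct = sum(p == g for p, g in zip(pred, gold))
    return correct / len(gold)

# Toy predicted and gold labels for four samples (3 of 4 correct).
acc = accuracy([0, 1, 2, 1], [0, 1, 1, 1])
```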
3.2. Model Comparisons. We compare our model with some baseline approaches:
Majority: the largest sentiment polarity in the training set is regarded as the classification result of each sample in the test set.
LSTM: a standard LSTM network, which models the review as a whole and uses the last hidden state of the LSTM as the final review representation .
TD-LSTM: TD-LSTM obtains the final sentiment representation by concatenating two LSTM networks which model the preceding and following contexts surrounding the target, respectively .
AE-LSTM: AE-LSTM concatenates the target vector with each word vector in the review as the input of the LSTM .
ATAE-LSTM: ATAE-LSTM appends the aspect embedding to each word vector to strengthen the importance of the target .
IAN: two LSTM networks are used to model the review and target phrase, respectively. It uses the hidden states of the review to generate an attention vector for the aspect, and vice versa. Based on these two attention vectors, it outputs a review representation and an aspect representation for classification .
The experimental results are shown in Table 2.
First, the worst method is Majority, demonstrating that a powerful feature representation is important for aspect-level sentiment classification. Among the LSTM-based methods, the basic LSTM approach performs worst because it models the whole review and ignores the target information. TD-LSTM improves over LSTM by 1% on the restaurant dataset and 2% on the laptop dataset once target information is taken into consideration. Because they introduce the attention mechanism, AE-LSTM and ATAE-LSTM perform better than TD-LSTM. IAN obtains better results on the restaurant and laptop datasets than the above LSTM-based methods because it explores separate representations of targets and interactive learning between the context and target. Our LT-T-TR model significantly surpasses IAN and all other baseline approaches. This reinforces our hypothesis that a model capable of capturing target-context dependencies interactively indeed performs better. We conduct a more detailed analysis in the following sections.
3.3. Model Analysis: The Effect of Different Pooling Functions. In this section, we analyze the contribution of various pooling functions (see equations (3)-(5)) by using the LT-T-TR model. The results are shown in Table 3. It can be seen that the accuracy (77.5%) is the lowest when using min pooling alone to extract hidden features. By using max and avg pooling, the model has a significantly improved accuracy (79.3% and 79.6%, respectively). Finally, we obtain the best accuracy (80.6%) by combining max and avg pooling.
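The max + avg variant in Table 3 could plausibly be implemented by concatenating the two pooled vectors; the paper does not specify how the two are combined, so concatenation is an assumption made here for illustration:

```python
import numpy as np

def max_avg_pool(h):
    """Hypothetical max + avg combination: concatenate the elementwise
    max and mean over a segment's hidden states h of shape (seq_len, dim)."""
    return np.concatenate([h.max(axis=0), h.mean(axis=0)])

# Toy hidden states: 3 words, hidden size 2.
h = np.array([[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]])
pooled = max_avg_pool(h)  # first half: max, second half: mean
```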
3.4. Model Analysis: The Effect of Different Sequence Models. We analyze the effect of different sequence models, recurrent neural networks (RNN), LSTM, and gated recurrent units (GRU), to verify the effectiveness of our model. The experimental comparison results are shown in Table 4. LSTM performs better than RNN because LSTM has more complicated hidden units and offers better computational capability. Meanwhile, GRU has fewer parameters to train than LSTM, which helps it achieve slightly better accuracy. Bi-LSTM performs slightly better than both GRU and LSTM because it can capture more contextual semantic information.
3.5. Model Analysis. To validate the effectiveness of the LT-T-TR model, we design several ablated models in this section. First, we input the review as a whole (rather than as three segments) into a Bi-LSTM for modeling and then use the attention mechanism to calculate the importance of each word toward the sentiment categories; we refer to this model as No-Separation. Second, we simplify the LT-T-TR model by using the average of the initial target vectors to represent the target phrase; we refer to this model as No-Target-Learned.
Furthermore, we compare the effect of interactive attention modeling between the target and the left/right context. First, we build a model (named No-Interaction) without interactive information by removing the attention interaction between the left/right context and the target phrase and learning the attention weights from each part's own Bi-LSTM hidden states alone. Then, we build the Target-to-Context model by removing context-to-target attention and keeping only target-to-context attention. Finally, we create an L-T-R model by dividing a review into the preceding context (without the target), the target, and the following context (without the target) and then modeling these three parts in the same way as in the LT-T-TR model.
Table 5 shows the experimental results. It can be seen that the No-Separation model achieves the worst performance among all approaches, and the No-Target-Learned model performs worse than the No-Interaction and Target-to-Context models. This verifies that the target representation is important for judging the final sentiment categories and that the target should be modeled separately.
L-T-R and LT-T-TR perform better than the No-Interaction and Target-to-Context models, which shows that the interaction between the target phrase and the left/right context is important for the final sentiment classification. Moreover, L-T-R has slightly worse results than the LT-T-TR model because its left/right contexts do not contain the target phrase.
3.6. Qualitative Analysis. In this section, we select three examples from the restaurant dataset to analyze which words contribute the most to the final classification. We obtain the attention weights and then visualize them using the visualization tool Heml . The results are shown in Figure 2, in which the color depth represents the importance of a word: the darker, the more important.
The review in Figures 2(a) and 2(b) is "The people with carts of food don't understand you because they don't speak English, their job is to give you the delicious food you point at." The corresponding targets are "food" and "people with carts of food," respectively. It can be seen that when a review contains two targets, our model computes the correct sentiment category for each target automatically; that is, the attention mechanism can dynamically pick out the important words from the whole review. In Figure 2(b), "people" is the most important word in the target phrase "people with carts of food." In Figure 2(c), the target is the multiword phrase "fried mini buns with the condensed milk and the assorted fruits on beancurd"; "buns" and "fruits" are more important than the other words, so our model pays more attention to them. This also shows that simply averaging the vectors of the words in the target phrase does not represent the target well. Therefore, modeling the target phrase and context interactively is important for aspect-level sentiment classification.
3.7. Error Analysis. We conducted an error analysis of the experimental results. The first type of error is caused by non-compositional sentiment expressions. For instance, in the review "not only was the look of the food fabulous, but also the taste was to die for," "taste" is a target and "to die for" is the relevant sentiment expression, whose meaning should not be understood literally. The second type of error comes from complex sentiment expressions such as double negatives, assumptions, and comparisons, as in "even though the price of this camera is unacceptable, I love its lens"; our model fails to handle the complex sentiment expression in this case. Furthermore, in the review "the movie was really on point--I was surprised," "movie" is the target word and the idiom "on point" is the relevant sentiment expression, which is difficult for our model to identify.
4. Related Work
4.1. Aspect-Level Sentiment Classification. Sentiment analysis, also known as opinion mining [1, 21], has attracted widespread attention from both industry and academia. As a fine-grained task in the field of sentiment analysis, aspect-level sentiment classification has drawn a lot of attention and is often treated as a kind of text classification problem. Traditional text classification methods depend greatly on the effectiveness of feature engineering, which generalizes poorly and makes it difficult to discover the underlying explanatory or discriminative factors of the data. In recent years, distributed word representations and neural network methods have been proposed and have shown promising performance on this task [7, 8]. Dong et al. used an adaptive recursive neural network to evaluate the sentiments of specific targets in context words. Vo and Zhang separated the whole review into three sections (the target, its left context, and its right context) and used neural pooling functions and a sentiment lexicon to extract the feature vector for a given target.
4.2. Neural Networks for Aspect-Level Sentiment Classification. Today, neural network approaches are extremely popular for many natural language processing tasks, and the field of sentiment classification is no exception. Many sentence- and document-level sentiment classification tasks are dominated by neural network architectures [23-25]. To further incorporate context information with target information, several models have been proposed, such as the target-dependent LSTM, which models each sentence toward the aspect. ATAE-LSTM and AT-LSTM are attentional models; AT-LSTM can be considered a modification of a neural attention model for entailment detection, swapping the premise's last hidden state for the aspect embedding. Han et al. proposed a novel neural network based on LSTM and the attention mechanism for word context extraction and document representation. Chen et al. combined a regional long short-term memory network and a convolutional neural network for target-based sentiment classification. Zhang et al. introduced dynamic memory networks based on a multiple-attention mechanism and LSTM, which showed significant performance in aspect-level sentiment classification. Yang et al. designed a coattention-LSTM network based on a coattention mechanism for aspect-based sentiment analysis, combining the target and context attention vectors of sentences. The work most relevant to ours is IAN, which models the sentence and the aspect term using two LSTM networks, respectively; it uses the hidden states of the sentence to generate an attention vector for the aspect, and vice versa, and outputs a sentence representation and an aspect representation for classification based on these two attention vectors.
Although the aforementioned methods are effective, discriminating different sentiment polarities for different targets is still a challenging issue. Therefore, it is necessary to design a powerful neural network for aspect-level sentiment classification.
In this study, we have proposed an interactive neural network for aspect-level sentiment classification. The approach uses Bi-LSTMs and an attention mechanism to interactively learn the important words in the target and context and generates the review representation for the final sentiment classification. Experimental results on the SemEval 2014 dataset show that our method achieves significant improvements. Our model analysis also compares different sequence models and shows that the model can discriminatively learn the important words in the context and in the target. However, as the error analysis shows, our model still cannot handle several difficult cases effectively.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was partially supported by the National Social Science Foundation of China under grant no. 17BXW071, the National Natural Science Foundation of China under grant no. 61562057, and the Technology Program of Gansu Science and Technology Department under grant no. 18JR3RA104.
References
[1] B. Liu, "Sentiment analysis and opinion mining," Synthesis Lectures on Human Language Technologies, vol. 5, no. 1, pp. 1-167, 2012.
[2] L. Jiang, M. Yu, M. Zhou, X. Liu, and T. Zhao, "Target-dependent twitter sentiment classification," in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 151-160, Portland, OR, USA, June 2011.
[3] Y. Kim, "Convolutional neural networks for sentence classification," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746-1751, Doha, Qatar, October 2014.
[4] X. Zhang, J. Zhao, and Y. LeCun, "Character-level convolutional networks for text classification," 2015, https://arxiv.org/abs/1509.01626.
[5] R. Johnson and T. Zhang, "Effective use of word order for text categorization with convolutional neural networks," 2014, https://arxiv.org/abs/1412.1058.
[6] D. Tang, B. Qin, and T. Liu, "Document modeling with gated recurrent neural network for sentiment classification," in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1422-1432, Lisbon, Portugal, September 2015.
[7] L. Dong, F. Wei, C. Tan, D. Tang, M. Zhou, and K. Xu, "Adaptive recursive neural network for target-dependent twitter sentiment classification," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pp. 49-54, Baltimore, MD, USA, June 2014.
[8] D.-T. Vo and Y. Zhang, "Target-dependent twitter sentiment classification with rich automatic features," in Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI), pp. 1347-1353, Buenos Aires, Argentina, July 2015.
[9] D. Tang, B. Qin, X. Feng, and T. Liu, "Effective LSTMs for target-dependent sentiment classification," in Proceedings of the International Conference on Computational Linguistics, pp. 3298-3307, Osaka, Japan, December 2016.
[10] M. Zhang, Y. Zhang, and D.-T. Vo, "Gated neural networks for targeted sentiment analysis," in Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, pp. 3087-3093, Phoenix, AZ, USA, February 2016.
[11] Y. Wang, M. Huang, X. Zhu, and L. Zhao, "Attention-based LSTM for aspect-level sentiment classification," in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 606-615, Austin, TX, USA, November 2016.
[12] D. Ma, S. Li, X. Zhang, and H. Wang, "Interactive attention networks for aspect-level sentiment classification," in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), pp. 4068-4074, Melbourne, Australia, August 2017.
[13] B. Huang, Y. Ou, and K. M. Carley, "Aspect level sentiment classification with attention-over-attention neural networks," in Proceedings of the 11th International Conference on Social, Cultural, and Behavioral Modeling (SBP-BRiMS 2018), pp. 197-206, Washington, DC, USA, July 2018.
[14] S. Zheng and R. Xia, "Left-center-right separated neural network for aspect-based sentiment analysis with rotatory attention," 2018, https://arxiv.org/abs/1802.00892.
[15] Y. Bengio, R. Ducharme, P. Vincent et al., "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, pp. 1137-1155, 2003.
[16] A. Graves, A.-R. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645-6649, Vancouver, Canada, May 2013.
[17] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[18] M. Pontiki, D. Galanis, J. Pavlopoulos, H. Papageorgiou, I. Androutsopoulos, and S. Manandhar, "SemEval-2014 task 4: aspect based sentiment analysis," in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pp. 27-35, Dublin, Ireland, August 2014.
[19] J. Pennington, R. Socher, and C. D. Manning, "GloVe: global vectors for word representation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532-1543, Doha, Qatar, October 2014.
[20] D. Tang, B. Qin, and T. Liu, "Aspect level sentiment classification with deep memory network," in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 214-224, Austin, TX, USA, November 2016.
[21] B. Pang and L. Lee, "Opinion mining and sentiment analysis," Foundations and Trends in Information Retrieval, vol. 2, no. 1-2, pp. 1-135, 2008.
[22] S. Kiritchenko, X. Zhu, C. Cherry, and S. Mohammad, "NRC-Canada-2014: detecting aspects and sentiment in customer reviews," in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pp. 437-442, Dublin, Ireland, August 2014.
[23] J. Bradbury, S. Merity, C. Xiong, and R. Socher, "Quasi-recurrent neural networks," 2016, https://arxiv.org/abs/1611.01576.
[24] Q. Qian, M. Huang, J. Lei, and X. Zhu, "Linguistically regularized LSTM for sentiment classification," in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), pp. 1679-1689, Vancouver, Canada, July 2017.
[25] K. S. Tai, R. Socher, and C. D. Manning, "Improved semantic representations from tree-structured long short-term memory networks," 2015, https://arxiv.org/abs/1503.00075.
[26] T. Rocktaschel, E. Grefenstette, K. M. Hermann, T. Kocisky, and P. Blunsom, "Reasoning about entailment with neural attention," 2015, https://arxiv.org/abs/1509.06664.
[27] H. Han, X. Bai, and P. Li, "Augmented sentiment representation by learning context information," Neural Computing and Applications, vol. 31, no. 12, pp. 8475-8482, 2018.
[28] S. Chen, C. Peng, L. Cai, and L. Guo, "A deep neural network model for target-based sentiment analysis," in Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, July 2018.
[29] Z. Zhang, L. Wang, Y. Zou, and C. Gan, "The optimally designed dynamic memory networks for targeted sentiment classification," Neurocomputing, vol. 309, pp. 36-45, 2018.
[30] C. Yang, H. Zhang, B. Jiang, and K. Li, "Aspect-based sentiment analysis with alternating coattention networks," Information Processing & Management, vol. 56, no. 3, pp. 463-478, 2019.
Hu Han, (1,2) Guoli Liu [ID], (1) and Jianwu Dang [ID] (1,2)
(1) School of Electronic & Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
(2) Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphic & Image Processing, Lanzhou 730070, China
Correspondence should be addressed to Guoli Liu; firstname.lastname@example.org
Received 16 September 2019; Revised 15 November 2019; Accepted 2 December 2019; Published 19 December 2019
Academic Editor: Maciej Lawrynczuk
Caption: Figure 1: The overall architecture of our aspect-level sentiment classification model.
Caption: Figure 2: Attention visualizations.
Table 1: The statistics of the datasets.

Dataset            Positive  Neutral  Negative
Laptop-train          994      464      870
Laptop-test           341      169      128
Restaurant-train     2164      637      807
Restaurant-test       728      196      196

Table 2: Comparison results. Accuracies for three-way classification on the restaurant and laptop datasets.

Method       Restaurant  Laptop
Majority        0.535    0.650
LSTM            0.743    0.665
TD-LSTM         0.756    0.681
AE-LSTM         0.762    0.689
ATAE-LSTM       0.772    0.687
IAN             0.786    0.721
LT-T-TR         0.806    0.743

Table 3: The effect of different pooling functions.

Pooling function  Restaurant
Min                  0.775
Max                  0.793
Avg                  0.796
Max + avg            0.806

Table 4: Comparison results.

Method               Restaurant  Laptop
LT-T-TR (RNN)           0.776    0.690
LT-T-TR (LSTM)          0.788    0.726
LT-T-TR (GRU)           0.790    0.730
LT-T-TR (Bi-LSTM)       0.806    0.743

Table 5: Analysis of our LT-T-TR model.

Method             Restaurant  Laptop
No-Separation         0.758    0.684
No-Target-Learned     0.760    0.707
No-Interaction        0.776    0.713
Target-to-Context     0.785    0.722
L-T-R                 0.795    0.730
LT-T-TR               0.806    0.743