Inferring Occupant Counts from Wi-Fi Data through Random Forest.
Inferring occupant counts has wide applications in building control and optimization, such as lighting and HVAC schedule optimization, energy benchmarking, and fault detection and diagnosis. Traditional occupancy detection methods (CO2 sensors, radio frequency (RF) sensors, camera-based sensors, etc.) require installing additional sensors or hardware, which adds cost and labor (Yang et al., 2016). Wi-Fi infrastructure, now deployed in almost every building to provide internet connectivity, offers a unique opportunity for virtual sensing of occupant counts (Pritoni et al., 2017). Despite rapid technological development and promising application potential, the reported methods that use Wi-Fi data to infer occupant counts have two limitations: (1) some require installing extra apps on the access points or end-user devices; and (2) others require recording the MAC addresses of connected devices (Wang et al., 2017), which raises privacy concerns. There is therefore still strong demand for inferring occupant counts accurately and non-intrusively, i.e., using the existing information infrastructure in buildings without installing extra hardware or software packages (Akkaya et al., 2015).
Our method consists of three major steps: data collection, feature engineering, and estimator development. In the data collection step, we avoid privacy concerns by anonymizing and reshuffling the MAC addresses every day.
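One way to picture the daily anonymization is a keyed hash whose secret key changes every day, so that a device keeps one pseudonym within a day but cannot be linked across days. The sketch below is only an illustration under that assumption: the text does not say which anonymization scheme was used, and the function name, the in-memory key store, and the 16-character pseudonym length are hypothetical.

```python
import hashlib
import hmac
import secrets
from datetime import date

# Hypothetical in-memory store of per-day secret keys. A fresh random key is
# drawn the first time a day is seen, so pseudonyms differ from day to day.
_daily_keys = {}


def pseudonymize_mac(mac, day):
    """Map a MAC address to a pseudonym that is stable only within one day."""
    if day not in _daily_keys:
        _daily_keys[day] = secrets.token_bytes(32)
    # Normalize case and separators so one device always hashes the same way.
    normalized = mac.strip().lower().replace("-", ":")
    digest = hmac.new(_daily_keys[day], normalized.encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:16]


today = date(2019, 7, 1)
tomorrow = date(2019, 7, 2)
same_day_a = pseudonymize_mac("AA-BB-CC-DD-EE-FF", today)
same_day_b = pseudonymize_mac("aa:bb:cc:dd:ee:ff", today)
next_day = pseudonymize_mac("aa:bb:cc:dd:ee:ff", tomorrow)
```

Within a day the pseudonym can still be used to measure how long a device stays connected, which is all the later feature engineering needs; discarding each day's key would make the mapping irreversible.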
The key step we use to improve model accuracy is clustering the Wi-Fi connected devices into categories based on their connection periods. A major reason that Wi-Fi connection counts alone cannot accurately infer occupant counts is that the mapping between the number of connected devices and the number of occupants is not consistent and may change over time and across spaces. As shown in Figure 2(a), there are different types of Wi-Fi connected devices, which belong to different types of owners and are subject to different mapping rules between Wi-Fi connection counts and occupant counts (for instance, two devices per occupant vs. one device per occupant). It is reasonable to assume that estimation accuracy improves if we differentiate the types of devices based on their daily connection duration and feed that information into the machine learning algorithm.
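The idea of splitting devices by daily connection duration can be sketched as follows. This is a simplified stand-in rather than the clustering actually used: it applies a single fixed threshold, and the 4-hour cut-off, the two category names, and the session format are assumptions made for illustration.

```python
from collections import defaultdict

# Assumed cut-off (hours connected per day) between short- and long-term
# devices; the text does not state the categories' boundaries.
LONG_TERM_HOURS = 4.0


def categorize_devices(sessions):
    """Label each device by its total connected time over one day.

    sessions: iterable of (device_id, start_hour, end_hour) tuples.
    Returns a dict mapping device_id to "long_term" or "short_term".
    """
    total = defaultdict(float)
    for device_id, start, end in sessions:
        total[device_id] += end - start
    return {
        dev: "long_term" if hours >= LONG_TERM_HOURS else "short_term"
        for dev, hours in total.items()
    }


def counts_at(sessions, categories, hour):
    """Count the devices connected at a given hour, split by category."""
    connected = {dev for dev, start, end in sessions if start <= hour < end}
    counts = {"long_term": 0, "short_term": 0}
    for dev in connected:
        counts[categories[dev]] += 1
    return counts


# Made-up sessions for one day (hours as decimals).
sessions = [
    ("a", 9.0, 12.0), ("a", 13.0, 17.5),  # 7.5 h in total
    ("b", 10.0, 10.5),                    # 0.5 h in total
    ("c", 8.5, 18.0),                     # 9.5 h in total
]
categories = categorize_devices(sessions)
features = counts_at(sessions, categories, hour=10.25)
```

The per-category counts at each time step, rather than one total connection count, are what would be passed to the estimator as features.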
We tested the method in an office building located in California, using 3 weeks of data to train the model and another 2 weeks to test it. The results are presented in Figure 2. In an area with an average occupancy of 22-27 people and a peak occupancy of 48-74 people, the root mean square error (RMSE) on the test set is less than four people. The prediction error is within two people more than 70% of the time and less than six people 90% of the time, a relatively high prediction accuracy compared with existing methods (Jiang et al., 2016; Wang et al., 2017; Wang et al., 2018).
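The two kinds of accuracy figures quoted above, the RMSE and the share of time steps whose error stays within a given number of people, can be computed as in the sketch below. The counts are made up for illustration and are not data from the study; treating "within" as an inclusive bound is also an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted counts."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def share_within(actual, predicted, tolerance):
    """Fraction of time steps whose absolute error is at most `tolerance`."""
    hits = sum(abs(p - a) <= tolerance for a, p in zip(actual, predicted))
    return hits / len(actual)


# Made-up occupant counts for five time steps (illustration only).
actual = [20, 25, 30, 48, 22]
predicted = [21, 24, 33, 41, 22]

error_rmse = rmse(actual, predicted)
within_two = share_within(actual, predicted, tolerance=2)
within_six = share_within(actual, predicted, tolerance=6)
```

Reporting both is useful because the RMSE is dominated by a few large misses, while the share-within figures describe how the estimator behaves most of the time.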
The random forest also lets us revisit the feature engineering discussed above. As shown in Figure 3, the number of long-term connected devices is more important for occupant count estimation than the number of short-term connected devices. Time features are not very important for inferring occupant counts, because the information they carry is already reflected in the Wi-Fi connection counts, which show a similar periodic variation.
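A feature ranking of this kind can be read from a fitted random forest. The sketch below assumes scikit-learn's RandomForestRegressor and its impurity-based feature_importances_ attribute; the text does not name the library or the importance measure used. The data are synthetic and deliberately constructed so that long-term devices drive the target, so the output illustrates the mechanics only and does not reproduce Figure 3.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data (not from the study): occupancy depends strongly on
# long-term devices, weakly on short-term devices, and not on the hour of day.
long_term = rng.integers(0, 60, size=n)
short_term = rng.integers(0, 30, size=n)
hour = rng.integers(0, 24, size=n)
occupants = 0.8 * long_term + 0.1 * short_term + rng.normal(0, 1, size=n)

feature_names = ["long_term_devices", "short_term_devices", "hour_of_day"]
X = np.column_stack([long_term, short_term, hour])

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, occupants)

# Impurity-based importances; they are non-negative and sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
ranking = sorted(importances, key=importances.get, reverse=True)
```

Impurity-based importances can understate a feature whose information is duplicated by another, which is consistent with the explanation given above for the low weight of the time features.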
As a next step, we plan to collect data from different buildings to test whether an estimator trained in one office building can be transferred to another building without retraining. This is critical because collecting ground truth data, in this case the real occupant counts, is expensive in practice: Wi-Fi data is available in many buildings, but very few buildings have occupant count data. It would also be helpful if researchers in this field open-sourced their data and established a shared database for testing and comparing new methods and algorithms.
This research was supported by the Assistant Secretary for Energy Efficiency and Renewable Energy, Office of Building Technologies of the United States Department of Energy, under Contract No. DE-AC02-05CH11231.
Akkaya, K., Guvenc, I., Aygun, R., Pala, N. and Kadri, A., 2015, March. IoT-based occupancy monitoring techniques for energy-efficient smart buildings. In Wireless Communications and Networking Conference Workshops (WCNCW), 2015 IEEE (pp. 58-63). IEEE.
Jiang, C., Masood, M.K., Soh, Y.C. and Li, H., 2016. Indoor occupancy estimation from carbon dioxide concentration. Energy and Buildings, 131, pp.132-141.
Pritoni, M., Nordman, B. and Piette, M.A., 2017. Accessing Wi-Fi data for occupancy sensing.
Wang, W., Chen, J. and Song, X., 2017. Modeling and predicting occupancy profile in office space with a Wi-Fi probe-based Dynamic Markov Time-Window Inference approach. Building and Environment, 124, pp.130-142.
Wang, W., Chen, J., Hong, T. and Zhu, N., 2018. Occupancy prediction through Markov based feedback recurrent neural network (M-FRNN) algorithm with WiFi probe technology. Building and Environment, 138, pp.160-170.
Yang, J., Santamouris, M. and Lee, S.E., 2016. Review of occupancy sensing systems and occupancy modeling methodologies for the application in institutional buildings. Energy and Buildings, 121, pp.344-349.
Zhe Wang, PhD
Author: Wang, Zhe; Hong, Tianzhen
Date: Jul 1, 2019