The process of recovering image and web page artifacts from the GPU.

1 INTRODUCTION

With the rise of the global internet over the past decades, digital technology has become an essential part of human life, which has inevitably also given rise to cybercrime. This shift toward digital has led to a new area of forensic science, digital forensics, which is concerned with collecting and examining potential digital evidence from computers, networks, and mobile phones. Within this field, data recovery has been gaining momentum in the investigation of digital devices. Potentially critical data that can benefit digital forensic investigations is processed and held for a limited time in the global memory of graphics processing units (GPUs). Due to the volatility of this data, few researchers have implemented forensic methods to recover artifacts from the GPU. To the best of the authors' knowledge, [7] is the only major work that has attempted to recover graphic images from a GPU's global memory dump using CUDA. However, that work was limited to 200x200 pixel TIFF files, and not all of the tested images were successfully restored.

This paper introduces a data recovery process for graphic and webpage artifacts from the global memory of GPUs. Building on the prior work in [7], which attempted to recover graphics from the global memory dump of GPUs using CUDA, we consider an enhanced data recovery method for retrieving inaccessible, lost, or deleted data using the Open Computing Language (OpenCL) framework. To evaluate the data recovery capabilities of OpenCL, we test several image formats, in different pixel sizes, and on different operating systems. Since the OpenCL framework is widely supported by GPU vendors, our proposed approach is applicable to a wide variety of GPUs. We also consider how the choice of GPU driver affects the data recovery process. Due to the large variety and possible combinations of available hardware and software, applying the OpenCL data recovery method to GPUs faces three major challenges: 1) the elusive global memory allocation scheme of GPUs; 2) varying levels of support across GPU drivers; and 3) the dependence of the recovery process on particular operating systems (OSs) and applications.

The rest of the paper is organized as follows. Related work is reviewed in Section 2, followed by the description and illustration of the proposed GPU graphics recovery process in Section 3. Results from the experiments are examined and discussed in Section 4. Section 5 concludes our work and gives an outlook on future research directions.

2 LITERATURE REVIEW

Offloading graphics processing tasks to the GPU has led to substantial improvements in the performance of graphical computations. The deployment of GPUs has further increased with the emergence of general-purpose graphics processing units (GPGPUs). While studying how GPU-assisted malware affects memory forensics, the authors of [1] found that GPUs can help various applications, including financial and scientific computations, achieve a substantial speed-up. The authors further posit that GPUs have enabled video transcoding, bitcoin mining, password recovery, and regular expression matching. However, they note that despite the GPU's ability to perform generic computations, GPU misuse, i.e., the use of GPUs to engage in malicious activity, has not been studied sufficiently. To perform a forensic analysis on the GPU, the authors developed several custom tools to gather and analyze numerous data structures. The data structures examined in the study include graphics page tables, hangcheck flags, a list of buffer objects, a list of contexts, and the register files. The study also revealed that the variety of GPU ecosystems poses substantial challenges to the forensic process, which makes it necessary to develop individual tools for the probable combinations of GPU models and operating systems. GPUs may also be used to solve both general tasks and tasks that require intensive computations. For instance, GPUs may be used to increase the performance of AES and RSA encryption algorithms. Similarly, GPUs can be used to accelerate routers that support IP networks [4]. GPUs may also aid in the establishment of high-speed intrusion detection systems (IDSs).
Researchers in [4] aimed to evaluate the potential risks associated with GPUs and, more specifically, how attackers can disclose sensitive data stored in the GPU's memory. While performing an in-depth analysis of GPUs to detect security vulnerabilities, the authors discovered that widely used GPUs, namely NVIDIA's and AMD's, fail to initialize newly allocated GPU memory pages, which are likely to contain sensitive user data. This vulnerability may then be exploited to reveal a victim's program data, particularly information stored in the GPU's memory. Such exploitation may happen both during the execution of a program and after its termination. Most of these attacks targeted the Chrome and Firefox web browsers, which render web pages through the GPU. The research also indicated that, despite the wide use of GPUs in the computing industry, their security issues have not been given the necessary consideration [4]. Random Access Memory (RAM) analysis is similar to the forensic analysis of GPUs; however, it is concerned with analyzing volatile information in RAM relating to running applications, network connections, and the command history [2]. Like GPU forensics, memory forensics is affected by the fact that RAM, being volatile memory, loses its data when the power is interrupted. However, under certain favorable conditions, such as uninterrupted power and an unlocked computer, a forensic investigation of the RAM can still be conducted within a particular time frame using specialized tools. The forensic analysis of RAM may require copying the RAM's contents to perform a comprehensive analysis of the memory dump, while in other cases it requires retrieving ASCII or Unicode string content [6].
The author of [7] conducted three experiments, the Color Test, the Line Test, and the Color Map Pattern Test, to explore the formatting pattern of images. The Color Test, in which different data were displayed on the screen, was aimed at exploring the data structures of colors within the GPU's memory. The evidence was then collected from the GPU's memory with the help of an enhanced model. To prepare the evidence, the researcher created eight squares of different colors using Photoshop. The squares were given individual values such as 000000, 00FF00, and FFFF00, among others. After analyzing the data structures and deleting empty memory spaces, the researcher successfully recovered the photo [7]. The limitations of that work are that the author tested only one image size, 200x200 pixels, and used CUDA, which is limited to NVIDIA GPUs. To improve upon these results, our paper tests different image formats in different sizes using OpenCL, which is supported by GPUs from multiple vendors such as AMD, Intel, and NVIDIA.

2.1 Open Computing Language (OpenCL)

OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors [3]. OpenCL offers several distinct advantages over CUDA. For example, the mathematical precision in OpenCL is well-defined, whereas in CUDA it is undefined. Furthermore, while OpenCL is supported by many GPU vendors such as AMD, Intel, and NVIDIA, CUDA is supported only by NVIDIA. Lastly, OpenCL provides CPU support, while CUDA does not. OpenCL provides specific functions for transferring data between buffer objects and host memory. The function clCreateBuffer creates a buffer object, optionally using host memory, while clEnqueueReadBuffer enqueues a command to read from a buffer object into host memory [3]. Read and write operations are both issued by placing commands in a command queue: reading transfers data from the buffer object to host memory, whereas writing transfers data from host memory to the buffer. Reading data with clEnqueueReadBuffer requires an already allocated area of host memory to store the data, because the function does not perform the memory allocation itself.

3 METHODOLOGY

The methodology used for this experiment was built on the workflow defined by [7]. The design process consists of three stages. Stage 1 acquires potential unique pixel patterns by first cleaning the GPU's global memory and then computing conversion matrices between the image and the data retrieved from the GPU's global memory. Stage 2 simulates the live capture process under the assumption of no noise. The images to be tested are loaded onto the GPU and then captured as a memory dump, which is then restored to candidate graphics by applying the unique patterns generated in Stage 1. In Stage 3, if the method is effective, one of the recovered images will be visually identical to the image loaded onto the GPU in Stage 2.

The experiment tests three image formats in three different sizes. The image formats that will be tested are JPEG, TIFF, and BMP. The image sizes that will be tested are 64x64 pixels, 100x100 pixels, and 200x200 pixels. Since memory allocation is difficult to predict, we first clean the memory in Stage 1 and Stage 2 in an attempt to ease the process of locating the dump data of the processed image.

3.1 Generating Patterns

The first step in this experiment is to generate patterns that will later be used to recover any image of the same size as the generated pattern. We generate a color map image for each image size as shown in Figure 2. The color map images were created using Adobe Photoshop, and each pixel of the image has a unique color. The purpose of generating these color map images is to create a set of pixel patterns that we can later use to recover the test image. To ease the process of recovering the dump data of the image, we cleaned the global memory of the GPU using OpenCL. The color map image was processed through the GPU by simply opening and closing the image in Windows Photo Viewer. Then, the dump data of the processed color map image was recovered from the GPU using OpenCL. Once the color map image has been processed and its dump data collected multiple times, a set of pixel patterns is ready to recover any image of the same size. For this experiment, the image sizes were 64x64, 100x100, 200x200, 256x256, 512x512, and 1024x1024 pixels.

Figure 3 shows the unique patterns that were recovered by processing the color map image multiple times. The same experiment was repeated with a 200x200 pixel image and a 64x64 pixel image. Table 1 shows the number of unique patterns found for each size.
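
The distinction in Table 1 between generated patterns and unique patterns can be illustrated with a short sketch. It assumes, purely for illustration, that each generated pattern is represented as a sequence of integer positions, one per pixel; this representation, the toy data, and the name `unique_patterns` are hypothetical and are not taken from the experiment.

```python
from typing import Iterable, List, Sequence, Tuple


def unique_patterns(generated: Iterable[Sequence[int]]) -> List[Tuple[int, ...]]:
    """Return the distinct patterns in the order they were first seen."""
    seen = set()
    unique = []
    for pattern in generated:
        key = tuple(pattern)  # tuples are hashable, so they can go into a set
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical toy data: five generated patterns for a 4-pixel image.
generated = [
    [0, 1, 2, 3],
    [2, 3, 0, 1],
    [0, 1, 2, 3],
    [2, 3, 0, 1],
    [0, 1, 2, 3],
]
print(len(generated), "generated,", len(unique_patterns(generated)), "unique")
```

In this toy run, five generated patterns collapse to two unique ones, which mirrors how repeated processing of the color map image yields many dumps but only a few distinct arrangements.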

3.2 Data collection

To validate whether the generated pixel patterns can recover images of the same size, it was necessary to first perform a test with a random image. For this study, a random image was opened and then closed in Windows Photo Viewer. We chose this program because it uses the GPU, supports many image file formats, and is one of the most commonly used image viewer applications. After loading the image into the GPU's memory, we used the OpenCL functions clCreateBuffer and clEnqueueReadBuffer to recover the dump data of the random image.

The above image shows an example of dump data that represents a 100x100 pixel image which can be used later to recover the image in its original shape. The offset 8A40000 marks the start of the image.

Noise removal is a critical step in the process of recovering images from the GPU. The highlighted part above shows noise that needs to be removed to successfully recover the image. Other noise that needs to be removed is the following:

1. rows with all 00

2. rows with all FF

3. FA F3 EE FF FA F3 EE FF FA F3 EE FF FA F3 EE FF

4. FA F3 EE FF FA F3 EE FF 00 00 00 00 00 00 00 00
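
The four noise rules above can be sketched as a simple row filter. This is a minimal sketch, assuming the dump is held in memory as raw bytes and is examined in 16-byte rows like the rows listed above; the names `ROW_SIZE`, `NOISE_ROWS`, and `remove_noise` are hypothetical and do not come from the original tooling.

```python
ROW_SIZE = 16  # assumed row width in bytes, matching the 16-byte rows listed above

FILL = bytes.fromhex("FAF3EEFF")  # the recurring 4-byte sequence FA F3 EE FF

NOISE_ROWS = {
    bytes(ROW_SIZE),        # rule 1: row of all 00
    b"\xff" * ROW_SIZE,     # rule 2: row of all FF
    FILL * 4,               # rule 3: FA F3 EE FF repeated four times
    FILL * 2 + bytes(8),    # rule 4: FA F3 EE FF twice, then eight 00 bytes
}


def remove_noise(dump: bytes) -> bytes:
    """Drop every 16-byte row that exactly matches one of the noise rows."""
    rows = (dump[i:i + ROW_SIZE] for i in range(0, len(dump), ROW_SIZE))
    # A trailing row shorter than 16 bytes never matches and is kept as-is.
    return b"".join(row for row in rows if row not in NOISE_ROWS)


# Synthetic example: two data rows interleaved with the four kinds of noise.
dump = (
    bytes(16)
    + bytes(range(16))
    + b"\xff" * 16
    + FILL * 4
    + bytes(range(16, 32))
    + FILL * 2 + bytes(8)
)
cleaned = remove_noise(dump)
print(len(dump), "bytes before,", len(cleaned), "bytes after")
```

Note that a filter like this cannot tell noise from genuine image rows that happen to be entirely 00 or FF, so it is only a starting point for the manual inspection described in this section.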

3.3 Image and webpage recovery process

After recovering the pixel patterns shown in Section 3.1 and setting the stage to recover the image, the patterns that store the image data were used to map the recovered image to its original state.

The patterns generated in Section 3.1 show the structure of the image in the GPU. Processing the color map image multiple times yields different pixel combinations. To successfully recover a random image, we use the generated patterns to find the combination of pixels that matches the random image processed by the GPU. It should be noted that [7] introduced the technique of pattern mapping, and this paper enhances it to support different image types and sizes as well as webpages.
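
One way to picture pattern mapping is as a position lookup derived from the color map image. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a noise-free dump that contains exactly one entry per pixel, represents pixels as RGB tuples, and uses the hypothetical names `derive_pattern` and `apply_pattern`.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def derive_pattern(color_map: Sequence[Pixel], color_map_dump: Sequence[Pixel]) -> List[int]:
    """For each dump position, find where that pixel sat in the original color map."""
    index_of: Dict[Pixel, int] = {color: i for i, color in enumerate(color_map)}
    if len(index_of) != len(color_map):
        raise ValueError("every color map pixel must have a unique color")
    return [index_of[color] for color in color_map_dump]


def apply_pattern(pattern: Sequence[int], image_dump: Sequence[Pixel]) -> List[Pixel]:
    """Move each dumped pixel back to the original position given by the pattern."""
    recovered: List[Pixel] = [(0, 0, 0)] * len(pattern)
    for dump_pos, original_pos in enumerate(pattern):
        recovered[original_pos] = image_dump[dump_pos]
    return recovered


# Toy 2x2 example. `scramble` stands in for the unknown GPU layout:
# dump position k holds original pixel scramble[k].
scramble = [2, 0, 3, 1]
color_map = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)]
color_map_dump = [color_map[i] for i in scramble]
pattern = derive_pattern(color_map, color_map_dump)

image = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (9, 9, 9)]
image_dump = [image[i] for i in scramble]
print(apply_pattern(pattern, image_dump) == image)
```

Because the color map gives every pixel a unique color, each dumped value identifies exactly one original position. When several unique patterns exist for an image size, each candidate pattern would be applied in turn and the outputs compared visually, as in Stage 3 of the methodology.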

Since there is no way to know the resolution of the image we want to recover, we assumed that its size is 1024x1024 pixels. The GPU used for this experiment was a GTS450 from the NVIDIA Fermi family. It has high-performance capabilities, with a 2x ability and DirectX 11 geometry processing power, and a maximum memory size of 1,024 MB [5].

3.4 GPUs and drivers test

Multiple drivers are tested to determine the effects of different GPU drivers on the graphics recovery process. According to [9], the driver of the GPU plays an essential role in accessing the GPU. In addition, several GPU types are tested to understand which GPUs support the proposed method and which do not.

3.5 Operating System test

The OS is one of the major factors that influence the process of recovering graphics from GPUs. The aim of the OS test is to measure the influence of the different operating systems on the recovery process. For this study, the NVIDIA GPU GTS450 was tested on three operating systems: Windows 7, Windows 8, and Windows 10.

4 RESULTS

The results of the experiments conducted in Section 3 will be evaluated and discussed in the following subsections.

4.1 Image recovery results

Several images with different sizes and formats were tested in this paper. To ensure the accuracy of the results, we tested ten different images for each size to determine whether the generated unique pixel patterns are sufficient to recover an image of the same size.

Table 2 shows the number and the type of the successfully recovered images. It can be observed from the experiment that the patterns generated in Section 3.1 were sufficient to recover images of the same size. Although a fully recovered image was not obtained for the 256x256 and 512x512 image resolutions, the recovered artifacts contain enough information to indicate what the image is about, which can help forensic investigators.

The author of [7] indicated that the experiment was looking for a single correct pixel pattern. However, based on the results in Figure 8, there is not one specific pattern for each image size, but rather a set of unique patterns. As shown in Figure 8, the recovered images shift between the four unique patterns generated in Section 3.1. All ten images tested were recovered successfully using just four patterns for the 100x100 pixel image. For the 64x64 pixel image, two unique patterns were discovered, and all ten images tested match one of those two unique patterns, as shown in Figure 9.

When testing larger images, the number of patterns increased, as shown in Table 1. For the image in Figure 10, the successful pixel pattern was pattern 2. It is important to note that using different GPUs could result in different pixel pattern combinations and could yield more or fewer unique patterns for each image size. The color map images that were used to generate the pixel patterns in Section 3.1 are in the TIFF format. Using these patterns, we were able to recover JPEG and BMP images from a pattern that was originally generated from a TIFF image. Thus, we conclude that the image format does not influence the recovery process. A 256x256 image was then tested to see whether the method remains usable at a higher resolution.

With nine unique patterns, the process of recovering a full image was not successful due to the increase in resolution. This suggests that the higher the resolution, the more patterns are needed to recover the image.

To confirm the resolution hurdle, a larger 512x512 image was tested. This time the test was run with fourteen unique patterns to increase the chances of recovering a full image. However, the process was not successful, and a fully recovered image was not obtained. That said, the recovered artifacts are in a viewable state and contain enough information to tell what the image is. A fully recovered image may be obtainable by generating more patterns from the GPU.

The results of the recovery process for the 1024x1024 image, shown in Figure 13, are surprising. Not only do the patterns show a clear visual, but one of the nine unique patterns also successfully reconstructed the image in its original form. However, this does not mean that every 1024x1024 image can be recovered successfully: of the three images tested, only one was recovered. In addition, the 256x256 and 512x512 images were not fully recovered, which suggests that more unique patterns are needed for larger images.

4.2 Webpage recovery

The webpage recovery process followed the same approach as recovering the images from the GPU, except this time our aim was to test the possibility of recovering artifacts of the last visited webpages. The browser used for this experiment is Google Chrome, and we edited the NVIDIA control panel to enable the GPU to operate whenever the user uses Google Chrome. This was necessary because, by default, when visiting websites, the GPU will not work unless a web extension requires GPU involvement.

We opened a Facebook account page to test the possibility of recovering any artifacts from the GPU after visiting the webpage. About 40 percent of the webpage content was recovered successfully and with high legibility, while the remaining 60 percent of the recovered webpage could not be read or recognized. In a case where a suspect is under investigation, recovering 40 percent of the last visited page has good potential for solving the case, e.g., when recovering illicit content from the suspect's machine.

Another webpage was tested to confirm that the recovery process is accurate. Figure 14 shows artifacts of the last visited Twitter page. The artifacts clearly show the NVIDIA logo along with some legible text. This information can be of use to forensic investigators. However, obtaining it without first cleaning the memory is hard due to the large amount of noise present in the dump file.

4.3 GPU and driver results

Several GPUs and drivers were tested to gain a proper understanding of which GPUs and drivers support the graphics recovery process and which do not.

Table 4 shows the GPUs and the drivers that did not support the graphics recovery process.

In these cases, every time OpenCL tried to collect the dump data, the dump file returned blocks of only zeros, indicating that those sections did not contain any data. To overcome this barrier, [7] used the driver 340.34 which supports the methods presented in this paper.

Drivers 340.43 for the GTX560M GPU and 340.62 for the GTS450 support the method presented in this paper. All beta drivers of NVIDIA for Windows 7 and Windows 8 support the method introduced in this paper and allow the data collection process. On the other hand, it was not possible to recover dump data from the AMD Radeon GPU as the return value of the dump file is zero.

4.4 Operating system (OS) results

Several OSs were tested to measure the impact of using different OSs on the GPU forensics process introduced in this paper.

The OSs tested were Windows 7, Windows 8, and Windows 10. The pixel patterns in Section 3.1 were recovered using Windows 7, and once the system was upgraded to Windows 8 the combination of the patterns changed. Therefore, new patterns had to be generated in order to successfully recover the tested image. The recovery attempt on Windows 10 was not successful because the new OS supports only the latest drivers, which, as indicated in Section 4.3, do not support the method introduced in Section 3. The GTS450 GPU was also moved to a new desktop running the same OS tested in Section 3 (Windows 7), which yielded different patterns than those generated on the previous desktop.

In conclusion, the only way to generate the correct set of pixel patterns is to use the same OS platform and GPU as the ones used when the image to be recovered was opened. In a real setting, this means that the investigator needs to clone the suspect's storage device, recover the dump data from the GPU, and then generate the set of patterns needed to recover the processed images. The remaining obstacle is locating the dump data that is linked to the processed image. In this experiment, this issue was tackled by first cleaning the memory, but in a real setting cleaning the memory is not an option.

4.5 GPU forensics challenges

We considered several factors to determine their potential influence on the image recovery process introduced in Section 3. It was observed that an investigator can generate the correct image patterns only when the same OS platform and GPU are used. Using a different OS or a different GPU will not help in generating the correct patterns. In fact, as reported in Section 4.4, even the same OS and GPU on a different desktop did not generate the same patterns. The limitations to applying forensics to GPUs are as follows:

* In a real setting, the memory cleaning process presented in Section 3.1 is the most obvious barrier, because the investigator cannot clear the global memory, otherwise the dump data stored there will be lost.

* As indicated in Section 4.3, not all drivers support the forensics process.

* As indicated in Section 4.4, the Windows 10 OS did not support the proposed method; thus, the pixel patterns could not be identified.

5 CONCLUSIONS

Since the use of digital technology has become an indispensable part of human life, the need for digital forensics is evident. GPUs hold valuable data that could very well solve cases. However, applying forensic techniques to volatile memory has its own challenges. This paper discussed a method using OpenCL to recover graphics content and webpages from GPUs. During the research, all of the tested images of up to 200x200 pixels were recovered successfully using a set of unique pixel patterns, while most of the larger images were recovered only in part. In sum, it was observed that larger image sizes tend to have more unique pixel patterns. Although the proposed technique could only partially recover the webpage, the recovered data provided enough information to determine what the user was reading and browsing. GPUs consist of different types of memory, and each type holds different types of data that can be helpful for forensic investigations. As future work, testing types of data other than images and webpages, as well as more AMD GPUs, is essential. Once the challenges presented in this research have been addressed, the ultimate goal of developing a forensics model for GPU analysis can be realized.

ACKNOWLEDGEMENT

This project is partially funded by Intel Grant #21626857.

REFERENCES

[1] Balzarotti, D., Di Pietro, R., and Villani, A.: The impact of GPU-assisted malware on memory forensics: A case study. Digital Investigation, 14, S16-S24. (2015).

[2] Garcia, G. L. Forensic physical memory analysis: an overview of tools and techniques. Paper presented at Helsinki University of Technology, Helsinki, Finland. (2007).

[3] Howes, L., and Munshi, A. The OpenCL Specification. Khronos Group. (2015).

[4] Lee, S., Kim, Y., Kim, J., and Kim, J. Stealing Webpages Rendered on Your Browser by Exploiting GPU Vulnerabilities. 2014 IEEE Symposium on Security and Privacy. (2014).

[5] NVIDIA. NVIDIA GeForce GTS 450. Retrieved from http://www.nvidia.co.uk/object/product-geforce-gts-450-uk.html (2017).

[6] Urrea, J. M. An analysis of Linux RAM forensics (Unpublished master's thesis). Naval Postgraduate School, Monterey, CA. (2006).

[7] Zhang, Y. Recovering image data from a GPU using a forensic sound method. Purdue University, West Lafayette, Indiana. (2015).

[8] Ladakis, E., Koromilas, L., Vasiliadis, G., Polychronakis, M., and Ioannidis, S. You Can Type, but You Can't Hide: A Stealthy GPU-based Keylogger. In 6th European Workshop on System Security (EuroSec). (2013).

[9] Lin, H.-X., Alexander, M., Forsell, M., Knüpfer, A., Prodan, R., Sousa, L., and Streit, A. (Eds.). Euro-Par 2009 - Parallel Processing Workshops: HPPC, HeteroPar, PROPER, ROIA, UNICORE, VHPC, Delft, the Netherlands, August 25-28, 2009, Revised Selected Papers. (2010).

Yazeed Albabtain Baijian Yang

Department of Computer and Information Technology, Purdue University, 401 N. Grant Street, West Lafayette, IN 47907 {yalbabta, byang}@purdue.edu

Table 1. Total number of identified patterns per image size.

Image size  Generated patterns  Unique patterns found

64x64       20                   2
100x100     35                   4
200x200     45                   9
256x256     50                   9
512x512     62                  14
1024x1024   20                   9

Table 2. The successful results of the image recovery process.

Image size  JPEG  BMP  TIFF

64x64       4     3    3
100x100     4     3    3
200x200     4     3    3
1024x1024   0     0    1

Table 3. The unsuccessful results of the image recovery process.

Image size  JPEG  BMP  TIFF

256x256     4     3    3
512x512     4     3    3
1024x1024   1     1    0

Table 4. Unsuccessful recovery attempts.

GPU                 GPU Drivers

AMD Radeon HD 6770  15.7, 14.12, 14.4, 12.1, 13.1, 13.4 and Catalyst
GTX560M             378.49, 353.9
GTX960M             All drivers from 341.81 (notebook, Win10 64-bit, international) to 359.06

Table 5. Successful recovery attempts.

GPU      GPU Drivers

GTX560M  340.43
GTS450   340.62

Table 6. Operating system test results.

Operating System       GPU     Same identified patterns?

Windows 7 → Windows 7  GTS450  No
Windows 7 → Windows 8  GTS450  No
Windows 10             GTX960  Couldn't generate any patterns

COPYRIGHT 2018 The Society of Digital Information and Wireless Communications
No portion of this article can be reproduced without the express written permission from the copyright holder.

Article Details
Author: Albabtain, Yazeed; Yang, Baijian
Publication: International Journal of Cyber-Security and Digital Forensics
Article Type: Report
Date: Apr 1, 2018