AI BASED HOTSPOT PREDICTION IN MULTICORE PROCESSORS

Published May 15, 2024
 1008 hours to build
 Intermediate

Our project introduces a machine learning-based predictive model for multicore processors, accurately forecasting hotspots using real-time core performance and temperature data. Leveraging a long-short-term-memory-based recurrent neural network, our model achieves 87.88% accuracy, aiding proactive thermal management. With a user-friendly interface, real-time CPU metrics are graphically plotted against predicted hotspots, enhancing system reliability and performance.

display image

Components Used

Microprocessors - (Intel i5-7300U, AMD Ryzen 5 5600H, AMD Ryzen 3 3250U)
Intel i5-7300U: A mid-range dual-core processor commonly found in laptops, offering a balance between performance and power efficiency. AMD Ryzen 5 5600H: A high-end hexa-core processor optimized for gaming and demanding applications, providing excellent multi-threaded performance. AMD Ryzen 3 3250U: An entry-level dual-core processor suitable for basic computing tasks, offering decent performance
1
Pycharm IDE - Python 3.12
PyCharm IDE: A powerful integrated development environment (IDE) for Python programming, offering advanced features and seamless integration with Python 3.12.
1
Open Hardware Monitor
Open Hardware Monitor: A software tool used for monitoring various hardware components in a computer system, providing real-time data on temperatures, voltages, fan speeds, and more.
1
Description

Abstract:

This project proposes a machine learning-based predictive model for multicore processors to forecast hotspots, enabling targeted cooling strategies for improved thermal management. The model utilizes real-time core performance and temperature data, along with historical data, to predict hotspots before they occur, achieving an 87.88% accuracy rate. Additionally, a user-friendly interface is developed to visualize real-time CPU metrics and compare them with predicted hotspots, aiding proactive thermal regulation in computer systems. 

Introduction:

The escalating integration and technology scaling of multicore processors have heightened industry focus on thermal management. Efficient temperature control not only optimizes processor performance but also enhances its reliability. Traditional cooling systems often neglect individual core temperatures, leading to accelerated chip aging. Our proposed software model leverages AI technology to predict hotspots seconds before they occur, enabling targeted cooling strategies. Real-time core utilization data, collected via the Open Hardware Monitor and processed using Python's data science modules, trains a long-short-term-memory-based recurrent neural network model to accurately forecast hotspots. Validated against commercial chips (Intel i5-7300U, AMD Ryzen 5 5600H, AMD Ryzen 3 3250U), our approach offers cost-effective hotspot prediction with minimal computational overhead.

                                                                       (Image Created : Canva)

Methodology:

Pseudocode For Dataset Acquisition

Procedure 1: Collect CPU Metrics

  1. Necessary libraries are imported.
  2. A function get_cpu_metrics is defined.
  • WMI is initialized with the OpenHardwareMonitor namespace.
  • Empty lists for core temperatures and core performance are created.
  • Temperature and performance information is retrieved from the sensors.
  • If a sensor is of type 'Temperature' and its name starts with 'CPU Core', its value is appended to the core temperatures list.
  • If a sensor is of type 'Load' and its name starts with 'CPU Core', its value is appended to the core performance list.
  • If an error occurs while accessing the sensors, an error message is printed.
  • Core temperatures and core performance lists are returned.
  1. An empty list to store the data is created.
  2. The duration for data capture is defined.
  3. Data is captured for the specified duration.
  • While the current time minus the start time is less than the duration:
  • The function to get core temperatures and performance metrics is called.
  • A timestamp is created.
  • A dictionary with the timestamp is created.
  • Core temperatures are added to the dictionary.
  • Core performance metrics are added to the dictionary.
  • The dictionary is appended to the list.
  • A 1-second delay is implemented before capturing the next data point.
  1. A DataFrame is created from the list of dictionaries.
  2. The DataFrame is saved to an Excel file.
  3. The current working directory is printed.

                                                               (Image Source: Screenshot taken by us)

Pseudocode For Model Training

Procedure 2: Train Hotspot Prediction Model

  1. Necessary libraries are imported.
  2. The dataset from 'CoreMetrics.xlsx' is loaded into a DataFrame.
  3. The performance metrics and temperature of each core are defined.
  4. The dataset is preprocessed:
  • Non-numeric characters are removed from the dataset.
  • The values are converted to float.

                                                                     

  1. Feature data and target variables are normalized using MinMaxScaler.
  2. The dataset is split into training and testing sets.
  3. The input data is reshaped to fit the LSTM input shape (samples, timesteps, features).
  4. The LSTM model is built:
  • A Sequential model is initialized.
  • Two LSTM layers with dropout are added.
  • A Dense layer with neurons equal to the number of cores is added.
  • The model is compiled with mean squared error as the loss function and Adam as the optimizer.
  1. The model is trained on the training data.
  2. The weights of the trained model are saved.
  3. Predictions are made on the test set.
  4. The scaled predictions and actual values are inverse-transformed to get the actual temperatures.
  5. The mean squared error between the actual and predicted temperatures is calculated.
  6. The accuracy is calculated as 100 minus the mean squared error.
  7. The accuracy of the model is printed.

Model validation is the process of evaluating the performance of the trained model on a separate dataset, known as the validation dataset. This dataset is not used during the training process and serves to provide an unbiased evaluation of the model. Metrics such as accuracy, precision, recall, and F1 score are used to measure the performance of the model. In our case, accuracy would be the percentage of correct hotspot predictions made by the model.

                                                                    (Image Source: Screenshot taken by us)

Pseudocode for Hotspot Prediction

Procedure 3: Predict Hotspot Occurrences

  1. The weights of the trained LSTM model are loaded.
  2. The current CPU performance metrics are retrieved using the previously defined function.
  3. The input data is prepared for prediction:
  • Performance metrics are reshaped and normalized.
  • The normalized data is reshaped to fit the LSTM input shape.
  1. Predictions are made on the input data.
  2. The MinMaxScaler is fit on the entire temperature value.
  3. The scaled predictions are inverse-transformed to get the actual temperatures.
  4. The predicted temperatures for each core are printed.
  5. A threshold temperature for hotspot detection is defined.
  6. Hotspots are detected:
  • If the predicted temperature for any core exceeds the threshold, a hotspot is detected in that core.
  • The cores where hotspots are detected are printed.
  • The hottest core is determined:
  • The index of the core with the highest predicted temperature is found.
  • The hottest core is printed.
  • If no hotspot is detected, "No hotspot detected." is printed.

In conclusion, our methodology provides a robust and dynamic approach to predict the hotspot of a multicore processor. By leveraging real-time performance metrics and machine learning, hotspots can be predicted before their occurrence, enabling more efficient and targeted cooling solutions.

                                                                    (Image Source: Screenshot taken by us)

Backend Implementation:

Data Collection:

  • Utilize Python to interface with Open Hardware Monitor to collect real-time performance metrics such as CPU temperature, fan speeds, load, and clock speeds.
  • Implement a script to continuously capture these metrics at a specified frequency and store them in a structured format.

Feature Selection:

  • Use Python libraries such as pandas and NumPy to preprocess the collected data.
  • Implement statistical methods and machine learning algorithms to identify relevant features that correlate strongly with hotspot occurrences.

Model Training:

  • Develop an LSTM-based recurrent neural network using Keras to train the predictive model.
  • Configure the LSTM model with appropriate layers, dropout, and loss function.
  • Train the model on the preprocessed data using gradient descent optimization algorithms.

Prediction:

  • Implement prediction functionality to feed new data into the trained model and obtain hotspot predictions.
  • Integrate the prediction module with the data collection process to provide real-time hotspot predictions.

Frontend Implementation:

User Interface:

  • Develop a user-friendly interface using Python libraries such as Tkinter or PyQt.
  • Design the interface to display real-time metrics and hotspot predictions in an intuitive manner.

Visualization:

  • Utilize matplotlib or other visualization libraries to create graphical representations of performance metrics and hotspot predictions.
  • Implement interactive features to allow users to explore and analyze the data visually.

                                                                       (Image Source: Generated by us in Matlab)

Software Tools:

  • Python: Main programming language for implementation due to its versatility and availability of libraries.
  • Open Hardware Monitor: Used for real-time monitoring of system parameters.
  • Pandas: Utilized for efficient data manipulation and analysis.
  • NumPy: Used for working with arrays efficiently.
  • Keras: Employed for building and training deep learning models.

Hardware Tools:

  • Multicore processors: Used for testing the predictive model.
  • System capable of running Open Hardware Monitor and Python scripts: Required for collecting real-time data and executing the predictive model.

Testing:

  • Real-world data was collected from three different machines with distinct processors: Intel i5-7300U, AMD Ryzen 5 5600H, and AMD Ryzen 3 3250U.
  • The collected data was fed into the trained LSTM model, and predictions were compared with the actual hotspots observed in the test data.
  • The performance of the model was evaluated using accuracy, calculated as the percentage of correct hotspot predictions.
  • Accuracy of the model was reported as 87.13% for Intel i5-7300U, 89.14% for AMD Ryzen 5 5600H, and 75.18% for AMD Ryzen 3 3250U.

Conclusion:

  • Our predictive model blends data science, machine learning, and user interface design.
  • It offers real-time hotspot prediction for multicore processors based on monitored core performance and temperature data.
  • The user-friendly interface enhances accessibility and allows users to visually understand processor performance and prediction accuracy.

Final Output User - Interface:

                                                     (Image Source: Screenshot taken by us after create the UI)

Codes

Downloads

ai-hotspot Download

Institute / Organization

St Josephs institute of technology
Comments
Ad