Haisal Dauda Abubakar, Sunusi Abu Darma
The demand for deep learning models in real-time systems, such as autonomous vehicles and healthcare diagnostics, has grown significantly owing to these models' ability to handle complex tasks such as object detection, decision-making, and medical image analysis. However, these models are computationally expensive, which makes them difficult to deploy on resource-constrained devices. To address this challenge, optimization techniques such as pruning, quantization, and transfer learning have become essential. This paper explores these techniques in detail, highlighting how each improves the efficiency of deep learning models without compromising accuracy. We discuss their practical applications in real-time systems and present a comparative analysis of their impact on model size, inference speed, and computational efficiency. The findings suggest that combining these techniques can effectively enhance the performance of deep learning models in autonomous vehicles, healthcare diagnostics, and other real-time applications.