Computer Architecture Simulation on GPU

Buitrago Paniagua, John Byron

Please use this identifier to cite or link to this item: https://hdl.handle.net/10495/38543

Title:	Computer Architecture Simulation on GPU
Other titles:	Simulación de Arquitectura de Computadores en GPUs
Author:	Buitrago Paniagua, John Byron
Advisor:	Velasquez Ricardo, Rivera Fredy Velásquez Vélez, Ricardo Andrés
Subjects:	Computer simulation (Simulación por computadores); Simulation methods (Métodos de simulación); Computer architecture (Arquitectura de computadores)
Publication date:	2024
Abstract:	Due to the power wall in computer architectures [33], processor manufacturers have adopted a strategy of increasing the number of cores in modern computers to enhance performance and reduce power consumption [100][93][21]. As a result, research and innovation now focus on integrating more cores on a single chip and on improving communication and memory systems, which produces a complex design space for these architectures [93][110]. Because processor complexity is outpacing processor performance, computer architecture simulators tend to slow down over time [28][36], which calls for more advanced simulation strategies and tools. Computer architects have proposed various techniques to speed up their simulators by reducing their workload and, separately, by exploiting their parallelism [89][41][104][15][54][29][55][34]. This work aims to accelerate the detailed simulation of the uncore in a multicore architecture that supports multithreaded workloads. It proposes a methodology that integrates the core models technique [16], which abstracts core behavior and reduces the time needed to simulate the cores, with techniques that leverage the parallel capacity of GPU platforms to speed up the simulation. The methodology includes establishing a core model and a target uncore, parallelizing them, and defining their interaction. It also covers the distribution of simulator tasks and the optimization of GPU process runtime. The methodology is evaluated through a case study that simulates architectures with 1 to 128 cores to measure how the speedup scales and to estimate the simulation’s accuracy. The reference model is a sequential, gem5-based CPU version. Accuracy is assessed by comparing the simulation cycles and the number of misses for each cache in the architecture.
The experimental results show that the speed of our GPU-based simulation, relative to the CPU version, increases with the number of simulated cores. The parallel simulator begins to show a speedup at 64 cores. In most of the results, the relative error for the simulation cycles is lower than 10%. The accuracy for the number of misses is consistent across all cache levels, with a relative error lower than 5% in most cases. However, performance and accuracy depended on workload type: some simulations showed relative errors of around 40% for the simulation cycles, and some benchmarks did not achieve any speedup at 128 cores.
Appears in collections:	Doctorados de la Facultad de Ingeniería

Files in this item:

File	Description	Size	Format
BuitragoJohn_2024_ComputerArchitectureSimulation.pdf Until 2025-02-22	Doctoral thesis	1.91 MB	Adobe PDF

This item is licensed under a Creative Commons license.