Channel: Active questions tagged cpu-architecture - Stack Overflow
Browsing all 106 articles

Why is it quicker to calculate the reciprocal square root than to compute the...

On uops.info, VRSQRTPS is listed as having a lower latency than VSQRTPS across all the architectures I've checked. It also has a lower throughput, but perhaps there are fewer units that can do it on most...


Seeking Verification: MIPS Cache Set Update Analysis

I've been working on a MIPS cache problem and wanted to double-check my solution. Here's the breakdown: Problem: Determining which sets of a direct-mapped data cache have been updated after executing a...
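
For readers checking a problem of this kind by hand, here is a minimal sketch of how store addresses map to sets in a direct-mapped cache. The excerpt does not give the cache geometry, so `block_size`, `num_sets`, and the function name `updated_sets` are hypothetical and chosen only for illustration.

```python
def updated_sets(store_addresses, block_size, num_sets):
    """Return the sorted set indices touched by the given byte addresses.

    Assumes a direct-mapped cache, so each block maps to exactly one set:
    set index = (address // block_size) % num_sets.
    block_size and num_sets are illustrative parameters, not taken from
    the original question.
    """
    touched = set()
    for addr in store_addresses:
        block_number = addr // block_size       # drop the byte offset bits
        touched.add(block_number % num_sets)    # low block bits pick the set
    return sorted(touched)


if __name__ == "__main__":
    # 16-byte blocks, 8 sets: 0x104 wraps around to the same set as 0x00.
    print(updated_sets([0x00, 0x10, 0x104], block_size=16, num_sets=8))
```

Whether a set counts as "updated" on a load miss as well as on a store depends on how the original problem defines it, which the excerpt does not show.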

How does a TLB manage memory translation for addresses that cross page...

Let's say we have a page size of 4096 bytes, and we have two contiguous virtual memory pages mapped to discontiguous physical pages, i.e. [x, x + 4096 * 2] maps to [A, A + 4096], [B, B + 4096] x...
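
As a rough illustration of the situation described, the sketch below splits one virtual access into per-page pieces, each of which would need its own translation. It models only the address arithmetic, not any particular TLB; the function name `split_access` is hypothetical, and the 4096-byte page size comes from the question.

```python
PAGE_SIZE = 4096  # page size stated in the question


def split_access(vaddr, size, page_size=PAGE_SIZE):
    """Split an access of `size` bytes at `vaddr` into per-page pieces.

    Returns a list of (virtual_page_number, offset_in_page, length) tuples.
    An access that stays inside one page yields one tuple; an access that
    crosses a page boundary yields one tuple per page touched.
    """
    pieces = []
    while size > 0:
        vpn, offset = divmod(vaddr, page_size)
        chunk = min(size, page_size - offset)   # bytes left in this page
        pieces.append((vpn, offset, chunk))
        vaddr += chunk
        size -= chunk
    return pieces


if __name__ == "__main__":
    # An 8-byte access starting 4 bytes before the end of page 0.
    print(split_access(4092, 8))
```

Each tuple carries a different virtual page number, so the two halves can resolve to unrelated physical frames such as A and B in the example.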

Windows vs Linux VM L3 Cache Size Discrepancy

I have a query regarding the disparity in L3 cache size between native Windows and a Linux virtual machine (VM) running on it. According to my CPU architecture documentation, the L3 cache size is...

Why would there be too many memory accesses in my cache simulator?

I have this code and I keep getting too many hits and too many misses. I am trying to create a cache simulator. I don't understand if the part where I see if it is a write is correct or not but also I...

The SUB instruction in CPU [duplicate]

Consider an 8-bit CPU only. Let's say we have two registers, R1 and R2. When performing SUB R1, R2, R2 is first converted to its two's complement and then added to R1, but I'm wondering, what if...
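
The subtract-by-adding scheme described above can be sketched as follows. This is a minimal model of a hypothetical 8-bit ALU, not any specific CPU; the function name `sub8` and the returned carry flag are illustrative assumptions.

```python
def sub8(r1, r2):
    """Compute R1 - R2 on a hypothetical 8-bit ALU as R1 + ~R2 + 1.

    Returns (result, carry_out). With this formulation carry_out == 1
    means no borrow occurred (R1 >= R2 as unsigned values), and
    carry_out == 0 means the subtraction wrapped around.
    """
    mask = 0xFF
    inverted = ~r2 & mask                 # one's complement of R2, 8 bits
    total = (r1 & mask) + inverted + 1    # adding 1 completes two's complement
    return total & mask, total >> 8


if __name__ == "__main__":
    print(sub8(5, 3))   # no borrow
    print(sub8(3, 5))   # wraps around modulo 256
```

The result is always taken modulo 256, so the same bit pattern serves for both unsigned and two's-complement signed interpretations.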

PMU reports no L1 cache loads when working with 1-3 cache lines - where does...

I run a dummy program that accesses an array of L1-cache-line-sized data to test how the cache architecture shows up in metrics such as the miss rate. Unexpectedly, when the program accesses only 1-3 lines,...

How does a benchmark run on a simulator?

I'm writing a 32-bit MIPS simulator in C++ just for study. But I can't understand how benchmark programs are run on a simulator. Also, I don't know how the code of benchmark programs is structured. Can you...

Is HyperThreading / SMT a flawed concept?

The primary idea behind HT/SMT was that when one thread stalls, another thread on the same core can co-opt the rest of that core's idle time and run with it, transparently. In 2013 Intel dropped SMT in...

Code won't stop running in MIPS assembly simulation

main:
    jal addNode      # Call addNode to insert the new node
                     # Assuming addNode returns the new node's value in $v0
    move $s3, $v0    # Move the returned value to $s3
    j end            # Jump to end to halt the...

Turing machine vs von Neumann machine

Background: The von Neumann architecture describes the stored-program computer where instructions and data are stored in memory and the machine works by changing its internal state, i.e. an instruction...

Interconnection of circuits in OpenFPGA

It is known that we use syntax to interconnect tiles of an FPGA in XML. Similarly, can we use to interconnect sub-circuits inside a CLB? I am expecting that we can either mention the port name of previous...

Write-back vs Write-Through caching?

My understanding is that the main difference between the two methods is that in the "write-through" method, data is written to main memory through the cache immediately, while in "write-back", data is...
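
To make the contrast concrete, here is a toy sketch that counts writes reaching main memory under each policy. It assumes a cache holding a single line and treats each address as a whole line; the class `OneLineCache` and the helper `count_memory_writes` are hypothetical names, not part of any real simulator.

```python
class OneLineCache:
    """Toy cache with one line, used only to count main-memory writes."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.tag = None          # line address currently cached
        self.dirty = False       # only meaningful for write-back
        self.memory_writes = 0

    def _evict(self):
        # A write-back cache pays for the deferred write at eviction time.
        if self.dirty:
            self.memory_writes += 1
        self.dirty = False

    def write(self, line_addr):
        if self.tag != line_addr:
            self._evict()
            self.tag = line_addr
        if self.write_back:
            self.dirty = True            # defer the memory update
        else:
            self.memory_writes += 1      # write-through: update memory now

    def flush(self):
        self._evict()


def count_memory_writes(line_addrs, write_back):
    cache = OneLineCache(write_back)
    for addr in line_addrs:
        cache.write(addr)
    cache.flush()
    return cache.memory_writes


if __name__ == "__main__":
    trace = [0, 0, 0, 1]
    print(count_memory_writes(trace, write_back=False))
    print(count_memory_writes(trace, write_back=True))
```

Repeated writes to the same line are where the two policies diverge: write-through pays for every store, while write-back pays once per dirty line when it is evicted or flushed.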

How are CPU instruction dependency graphs built?

We talk a lot about out-of-order and superscalar processors having optimisations viable only after we verify that some instructions do not depend on one another. How are these dependency graphs built by...

Why do assembly programs use the register indirect addressing mode when they...

In what follows, I use a generic assembly language as is done in my text (Computer Organization and Embedded Systems, 6e, by Hamacher et al.). Consider the code snippet below which is the setup to...

What's the difference between a 'fast' (instruction) syscall and...

From my understanding, the syscall/sysenter instructions and their companions were introduced in recent architectures to serve as a shorter path into the kernel. But I don't understand how it achieves...

Can managed code impact instruction level parallelism?

Is there any way I can impact Instruction Level Parallelism writing C# code? In other words, is there a way I can "help" the compiler produce code that best makes use of ILP? I ask this because I'm...

Can I pipeline a multi-cycle RISC-V core, and how?

I have a multi-cycle RISC-V core, picorv32. Each instruction passes through three stages: fetch, load registers, and execute. These are the three main ones; other operations are also performed. So I want to pipeline...

Why do fast memory writes when run over multiple threads take much more time...

I have a program which allocates some memory (200 million integers), does some quick compute and then writes the data to the allocated memory. When run on a single thread the process takes about 1...

Aligning to cache line and knowing the cache line size

To prevent false sharing, I want to align each element of an array to a cache line. So first I need to know the size of a cache line, so I can give each element that number of bytes. Secondly I want the...
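
The arithmetic behind this kind of padding can be sketched as below. The 64-byte default is an assumption (a common line size, but it should be queried on the target machine), and `padded_stride` and `align_up` are hypothetical helper names.

```python
def padded_stride(element_size, line_size=64):
    """Round an element size up to a whole number of cache lines.

    line_size=64 is an assumed default, not a value from the question.
    """
    return -(-element_size // line_size) * line_size   # ceiling division


def align_up(addr, line_size=64):
    """Round an address up to the next cache-line boundary.

    Requires line_size to be a power of two.
    """
    return (addr + line_size - 1) & ~(line_size - 1)


if __name__ == "__main__":
    print(padded_stride(4))    # a 4-byte counter occupies one full line
    print(align_up(100))       # first line boundary at or after address 100
```

Both pieces are needed to avoid false sharing: the array base must start on a line boundary, and the stride between elements must be a multiple of the line size.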
