Introduction to Cache Memory

Adding Cache to IPs and SoCs

Cache memory significantly reduces time and power consumption for memory access in systems-on-chip. Technologies like AMBA protocols facilitate cache coherence and efficient data management across CPU ...

Nvidia says it can shrink LLM memory 20x without changing model weights

Nvidia's KV Cache Transform Coding (KVTC) compresses LLM key-value cache by 20x without model changes, cutting GPU memory costs and time-to-first-token by up to 8x for multi-turn AI applications.

New KV cache compaction technique cuts LLM memory 50x without accuracy loss

MIT researchers developed Attention Matching, a KV cache compaction technique that compresses LLM memory by 50x in seconds — ...

PC World

How does CPU memory cache work?

In the eighties, computer processors became faster and faster, while memory access times stagnated and hindered additional performance increases. Something had to be done to speed up memory access and ...

Ars Technica

Adding cache memory to an ASUS P5A-B...

The cache is soldered to the board, so yer out of luck there. In theory, the Aladdin 5 could cache up to 512, but the early chipsets had a flaw in the cache tag RAM that caused the 128 MB limitation.

TMCnet

Penguin Solutions Introduces Industry's First Production-Ready CXL-Based KV Cache Server

Accelerating memory-dependent AI processes, Penguin's MemoryAI KV cache server increases memory capacity by integrating 3 TB of DDR5 main memory and up to eight 1 TB CXL Add-in Cards (AICs). Penguin ...

Semiconductor Engineering

A Primer On Last-Level Cache Memory For SoC Designs

System-on-chip (SoC) architects have a new memory technology, last level cache (LLC), to help overcome the design obstacles of bandwidth, latency and power consumption in megachips for advanced driver ...

Hackaday

Spectre And Meltdown: How Cache Works

The year so far has been filled with news of Spectre and Meltdown. These exploits take advantage of features like speculative execution and memory access timing. What they have in common is the fact ...

EDN

MRAM technologies: from space applications to unified cache memory?

Magneto-resistive random access memory (MRAM) is a non-volatile memory technology that relies on the (relative) magnetization state of two ferromagnetic layers to store binary information. Throughout ...
