It is no mystery that the hardware landscape is changing radically. Looking at secondary storage, a lot has changed in the last 10 years, especially where disks are concerned.

We have gone from a simple hierarchy of primary memory, disks, and tapes, where disks were differentiated only by rotational speed (15k, 10k, 7k rpm), to a far more nuanced landscape with the introduction of Flash and persistent memories.

And this happens at the primary memory level, with NVDIMM and PMEM, but also at the secondary storage level, with the introduction of a new SSD tier.

The SSD tier is in turn quite varied, split between NVMe drives and drives with traditional SAS or SATA interfaces. And that is without going into the construction details of Flash memories, where several technologies coexist, each with its own speed and performance characteristics.

But what does all this have to do with operating systems? They need to be adapted to respond to the new use cases and peculiarities of these devices.

For example, does it make sense for the filesystem to cache very fast NVMe storage? Maybe not, but it might make a lot of sense to put a write cache in front of SSD storage that is very “slow” at writing.
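To make the trade-off concrete, here is a minimal C sketch that bypasses the Linux page cache with O_DIRECT, the mechanism an application (or an OS layer) can use when caching a very fast device costs more than it saves. The file path and block size are illustrative assumptions, and O_DIRECT requires a filesystem that supports it.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    /* Placeholder path: must live on a filesystem that supports O_DIRECT. */
    int fd = open("nvme-test.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* O_DIRECT requires the buffer, offset, and length to be aligned
     * to the device's logical block size (4096 assumed here). */
    size_t block = 4096;
    void *buf;
    if (posix_memalign(&buf, block, block) != 0) { close(fd); return 1; }
    memset(buf, 0xAB, block);

    /* This write goes straight to the device, skipping the page cache. */
    if (write(fd, buf, block) != (ssize_t)block) perror("write");

    free(buf);
    close(fd);
    return 0;
}
```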

But where everything really changes is in primary memory: since NVDIMM and PMEM offer more capacity but are slightly slower than “normal” RAM, does it make sense to implement tiering or caching between these primary memories? Even more importantly, since unlike RAM these new memories are persistent (i.e. they keep their data even when the system is powered off), is it possible to build systems that suspend, but also restart, the operating system dramatically faster?
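To make the persistence property concrete, here is a minimal C sketch that maps a file on a hypothetical DAX-mounted persistent-memory filesystem (/mnt/pmem is an assumed mount point) and makes a write durable. After a power cycle, remapping the same file yields the same bytes, which is the building block a fast suspend/restart scheme would rely on; a production design would use dedicated flush primitives such as those in libpmem, while plain msync() keeps the sketch portable.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Assumed DAX mount point; any mmap-able filesystem works for the demo. */
    const char *path = "/mnt/pmem/state.bin";
    size_t len = 4096;

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, (off_t)len) != 0) { perror("ftruncate"); close(fd); return 1; }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* The store lands in (persistent) memory; msync makes it durable,
     * so the state survives a reboot and can simply be remapped at boot. */
    strcpy(p, "state survives reboot");
    msync(p, len, MS_SYNC);

    munmap(p, len);
    close(fd);
    return 0;
}
```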

So far nothing special: all modern processors offer higher density, more computing power, a smaller footprint, and lower power consumption.

But if we analyze the internal architecture, we notice many surprises and many differences compared to traditional microprocessors:

There are many cores… yes, but the novelty is that the cores differ from one another: the subsystem uses Flex-Scheduling technology, consisting of two high-performance cores based on Cortex-A76, two high-efficiency cores also based on Cortex-A76, and four ultra-efficient cores based on Cortex-A55.

Why different cores? Here’s the beauty of it: the faster, more powerful (but also more power-hungry) cores handle heavy workloads during peaks; the high-efficiency (thermally and power-wise) cores provide the performance for most workloads; and the ultra-efficient cores handle low-load background tasks while minimizing power consumption when, for example, the system is nearly idle.
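On Linux, the most basic mechanism for steering work onto specific cores is CPU affinity. The sketch below assumes a hypothetical numbering that mirrors a 2+2+4 design (CPUs 0–3 being the ultra-efficient Cortex-A55 cores) and confines the calling process to them, the kind of placement an OS or a service manager might apply to background tasks.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    /* Hypothetical layout: CPUs 0-3 are the Cortex-A55 efficiency cores,
     * CPUs 4-7 the two pairs of Cortex-A76 cores. */
    cpu_set_t little;
    CPU_ZERO(&little);
    for (int cpu = 0; cpu <= 3; cpu++)
        CPU_SET(cpu, &little);

    /* Confine this (background) process to the efficiency cores only. */
    if (sched_setaffinity(0, sizeof(little), &little) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    /* ... run low-priority background work here ... */
    return 0;
}
```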

The GPU included in this processor is a Mali-G76 with clock boosting technology designed for computing, gaming or AI applications.

But the real novelty comes from the two NPUs (Neural Processing Units), which run specific AI functions, image recognition for example, with greater speed and at greater model complexity than a generic GPU or CPU.

With such a processor, it certainly takes a redesigned operating system to exploit the cores intelligently and selectively according to the workload.
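Linux already exposes one such mechanism: utilization clamping, set through the sched_setattr() system call, lets a task declare that it will never need much CPU capacity, so the energy-aware scheduler can keep it on efficient cores. The sketch below is a minimal illustration; the clamp value of 128 (on the kernel’s 0–1024 capacity scale) is an arbitrary choice, and the struct is declared by hand because older glibc versions do not wrap this syscall.

```c
#include <sched.h>        /* SCHED_OTHER */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Mirrors the kernel's struct sched_attr (see sched_setattr(2)). */
struct ksched_attr {
    uint32_t size;
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;
    uint32_t sched_priority;
    uint64_t sched_runtime;
    uint64_t sched_deadline;
    uint64_t sched_period;
    uint32_t sched_util_min;
    uint32_t sched_util_max;
};

#ifndef SCHED_FLAG_UTIL_CLAMP_MAX
#define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
#endif

int main(void) {
    struct ksched_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.sched_policy = SCHED_OTHER;              /* normal time-sharing task */
    attr.sched_flags = SCHED_FLAG_UTIL_CLAMP_MAX;
    attr.sched_util_max = 128;                    /* cap utilization at ~12% of 1024 */

    /* Tell the scheduler this task never needs a big core. */
    if (syscall(SYS_sched_setattr, 0, &attr, 0) != 0) {
        perror("sched_setattr");
        return 1;
    }
    /* ... background work now tends to stay on efficient cores ... */
    return 0;
}
```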

The GPU and NPU side can be left to applications, although some operating system features may themselves benefit from these units.
