Intel Expands Its Comfort Zone with New ARM-Powered FPGAs for Datacenters

Intel announced it is sampling its Stratix 10 FPGAs, the latest family of field-programmable gate arrays designed to accelerate a number of datacenter workloads. The new devices, which Intel is calling “the most significant FPGA innovations in over a decade,” offer advanced features like embedded 64-bit ARM processors, second-generation High Bandwidth Memory (HBM2), and DSP blocks.

The server applications Intel is targeting with the Stratix 10 family are somewhat tangential to those of Nvidia's and AMD's newest GPU accelerators, as well as Intel’s own Knights Landing Xeon Phi. However, Intel believes the family suits workloads such as signal processing, data compression, data encryption, storage management, and video encoding, and in truth practically any server-side application where data throughput is the driving criterion. With the DSP blocks offering lots of hardwired FLOPS, these devices can also be used for high performance computing.


The debugging toolkit for MALT has been released


The first version of the toolkit for MALT software development and debugging has been released. The kit includes an emulator, a debugger and a profiler. The emulator makes it possible to run and debug MALT programs on general-purpose computers running Unix-like systems. The emulator, its integrated GDB debugger and the profiler significantly simplify developing and porting programs for the MALT system, and also make it possible to evaluate the efficiency of an algorithm implementation on MALT without running it on real hardware.



Kilocore - World's First 1,000-Processor Chip


Image: The University of California

A microchip containing 1,000 independent programmable processors has been designed by a team at the University of California, Davis, Department of Electrical and Computer Engineering.



ISC High Performance 2016 Conference




The ISC High Performance (ISC 2016) conference, held 19-23 June, attracted 3,092 attendees from 53 countries, as well as 146 companies and research organizations showcasing their technologies and services at the ISC exhibition.



A C compiler for the MALT-Cv1 programmable accelerator has been developed


We’ve developed a C compiler that generates optimized code for the programmable accelerator architecture. On target tasks, the code generated by the compiler reaches 80% of the performance of code hand-written by a programmer in assembly language! The compiler has been developed using a set of domain-specific languages (DSLs) for rapid translator construction. This DSL set makes it possible to describe the main phases of translation. In particular, it provides Prolog-like descriptions of program transformation rules and a combinatorial approach to building traversal strategies for intermediate representation graphs.
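The idea of keeping rewrite rules separate from the traversal strategy can be illustrated with a small sketch. The compiler's actual DSL is Prolog-like and is not shown in this announcement, so everything below is an assumption made for illustration: the `Node` type, the two algebraic rules, and the function names are all hypothetical, written in plain C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intermediate representation: a tiny expression tree. */
typedef enum { NUM, VAR, ADD, MUL } Kind;

typedef struct Node {
    Kind kind;
    int value;               /* meaningful for NUM only */
    struct Node *lhs, *rhs;  /* meaningful for ADD and MUL only */
} Node;

/* A rule returns the replacement node, or NULL when it does not match. */
typedef Node *(*Rule)(Node *);

static int is_num(const Node *n, int v)
{
    return n->kind == NUM && n->value == v;
}

/* Rule: x + 0 -> x */
Node *rule_add_zero(Node *n)
{
    if (n->kind == ADD && is_num(n->rhs, 0))
        return n->lhs;
    return NULL;
}

/* Rule: x * 1 -> x */
Node *rule_mul_one(Node *n)
{
    if (n->kind == MUL && is_num(n->rhs, 1))
        return n->lhs;
    return NULL;
}

/* Combinator: try the rules in order and restart after every success,
 * stopping once no rule matches the current node. */
Node *apply_rules(Node *n, const Rule *rules, size_t nrules)
{
    size_t i = 0;
    while (i < nrules) {
        Node *r = rules[i](n);
        if (r != NULL) {
            n = r;
            i = 0;
        } else {
            i++;
        }
    }
    return n;
}

/* Strategy: bottom-up traversal. Children are rewritten first,
 * then the rules are applied to the node itself. */
Node *rewrite_bottom_up(Node *n, const Rule *rules, size_t nrules)
{
    assert(n != NULL);
    if (n->kind == ADD || n->kind == MUL) {
        n->lhs = rewrite_bottom_up(n->lhs, rules, nrules);
        n->rhs = rewrite_bottom_up(n->rhs, rules, nrules);
    }
    return apply_rules(n, rules, nrules);
}
```

With the rule array `{ rule_add_zero, rule_mul_one }`, calling `rewrite_bottom_up` on the tree for `(x * 1) + 0` returns the node for `x`. Swapping in a different traversal function (for example, top-down) would change the strategy without touching the rules, which is the separation the paragraph above describes.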



We've started to create the MALT-Cv1 netlist using 28 nanometer TSMC technology


We've started to create the MALT-Cv1 netlist using 28-nanometer HPC+ (high-performance computing) TSMC technology. The planned chip area is 12 mm². Such an area is the economic optimum for pilot-batch manufacturing under MPW (Multi-Project Wafer). The estimated power consumption on a target task is 1 W, which makes it possible to achieve considerably higher energy efficiency than on a CPU or GPU.



An assembler and an emulator for the programmable accelerator have been developed as parts of MALT


We’ve developed an assembler with an algebraic syntax similar to that of the C language. As a result, a program written in this assembly language is also a valid C program. This beneficial side effect of the algebraic notation made it possible to implement high-performance software modeling of the programmable accelerator with an ordinary C compiler.



Leopard processor element for vector MALT coprocessor has been designed


Development of the processor element for the Leopard vector accelerator has been completed. The processor element architecture was chosen to meet the requirements of maximum flexibility (from a programming perspective) combined with high performance and energy efficiency on target tasks. As a result, an architecture based on an ALU tree was chosen.



The development of MALT-processors with vector and mixed architecture has been started




Extremely complex problems from discrete mathematics that combine perfect or almost perfect data parallelism with compact (in terms of core size) yet particularly complex mathematical procedures can be implemented energy-efficiently only on specialized programmable or configurable computing structures, either on FPGA/VLSI or as blocks within a CPU/GPU.



210-core processor on FPGA Xilinx Virtex7 has been built


We’ve recently finished assembling and debugging a new monster: a 210-core processor prototype on the Xilinx Virtex7 2000T FPGA. This is the biggest chip in the 7th generation of Xilinx FPGAs, and our MALT system is the largest array of independent 32-bit RISC cores prototyped on a single FPGA known today.



A new smart memory controller with DMA support has been developed


A new version of the smart memory controller has been developed and tested. The controller now supports block data transfers, usually called ‘DMA’: a mechanism that copies data from one memory area to another without involving the processing cores. Using this mechanism can increase the performance of tasks with intensive data exchange several times over. The controller also provides a basic set of "real" atomic operations; for instance, atomic increment is supported.
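As a rough host-side analogy only, the two controller features can be pictured with standard C11 facilities. This is not the MALT programming interface, which the announcement does not describe: the names `block_copy` and `counter_increment` are hypothetical, and on real hardware the copy would be carried out by the controller rather than by the calling core.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a block transfer request. Here the calling
 * core does the copy via memcpy; a DMA-capable controller would move
 * the block itself and leave the core free for other work. */
void block_copy(void *dst, const void *src, size_t nbytes)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, nbytes);  /* the two regions must not overlap */
}

/* Atomic increment: a single indivisible read-modify-write.
 * Returns the value the counter held before the increment. */
unsigned counter_increment(atomic_uint *counter)
{
    return atomic_fetch_add(counter, 1u);
}
```

The point of an atomic increment is that many threads can bump a shared counter without a lock and without losing updates, since each increment completes as one step.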





ISC’14 was held June 22-26 in Leipzig, Germany. ISC is the world’s oldest high-performance computing conference and exhibition, and the most significant one in Europe for the global HPC community.



The developed architecture is named ‘MALT’


We thought long and hard about what name to give our architecture. After much deliberation, we’ve decided to name it ‘MALT’: Manycore Architecture with Lightweight Threads.



49-core processor on Xilinx Virtex6 FPGA with lightweight-thread support has been built


This is a significant milestone in the project's history. A multi-threaded processor containing 49 RISC cores has been implemented on a Xilinx Virtex6 FPGA. The architecture, completely redesigned since the 10-core prototype, makes it possible to keep dozens or even hundreds of simple computing cores effectively loaded without conflicts or excessive overhead.



Supercomputer summer classes


Supercomputer summer classes are held at the Faculty of Computational Mathematics and Cybernetics of the Academy, SRCC, REC "Supercomputer technology" from June 24 to July 6, 2013.

