Industrial Talks and Tutorials

Industrial Talks (November 12, Monday, 2018)

[Industrial Talk 1] (13:00~14:00)
Arm Cortex CPU cores and technology portfolio

Shean Chung
Senior FAE Manager, Arm, Korea



Shean Chung has been working at Arm as an FAE and FAE manager since 2006, deploying Arm technologies and enabling a wide range of Arm-based products. Prior to Arm, he gained over 10 years of experience in software engineering across several architectures and systems, and he has broad technical experience with heterogeneous platforms ranging from embedded devices to enterprise machines. He has recently been interested in heterogeneous system design-in for the most efficient performance in edge devices.




Computing has now become a central part of our everyday life, and smartphones have enabled computing everywhere, from the smallest devices such as smartwatches and VR goggles to smart homes and smart cars. This presentation shows that Arm Cortex processors, although built on a single Arm architecture, have been implemented in different ways and are widely used across a diverse range of applications, and that Arm compute technology is deployed throughout the ecosystem, beyond mobile, from sensor to cloud.



[Industrial Talk 2] (14:00~15:00)
Efficient inference and training of deep neural networks in limited precision

Jun Haeng Lee
Research Master at Samsung Advanced Institute of Technology (SAIT), Korea



Jun Haeng Lee is a Research Master at Samsung Advanced Institute of Technology (SAIT), where he has been working on neuromorphic engineering, deep learning, and neural processors since 2009. He received his bachelor's degree (1999), master's degree (2001), and Ph.D. (2005) in Electrical Engineering from Korea Advanced Institute of Science and Technology (KAIST). Before joining SAIT, he was a research staff member at KDDI R&D Labs. He was a visiting researcher at the Institute of Neuroinformatics (INI) from Feb. 2015 to Feb. 2017. His work primarily focuses on designing efficient deep neural networks for low-precision inference engines.



Deploying state-of-the-art deep neural networks (DNNs) on embedded systems is challenging due to their huge number of computations and large memory requirements. These impediments are partly caused by the considerable redundancy in the network parameters, which is intended to ease training. There are therefore abundant opportunities for trimming strategies such as pruning and quantization to low precision. I will introduce techniques for converting DNNs trained in full precision into low-precision, accelerator-friendly formats. I will also review ideas that enable training DNNs on low-precision computing units.
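As one concrete illustration of the quantization step mentioned above, the following is a minimal sketch of per-tensor symmetric weight quantization to 8-bit integers. The function names and the single per-tensor scale are illustrative assumptions, not the specific method presented in the talk.

```python
import numpy as np

def quantize_symmetric(w, n_bits=8):
    """Uniform symmetric quantization of a weight tensor to n_bits integers."""
    qmax = 2 ** (n_bits - 1) - 1           # e.g. 127 for int8
    scale = np.max(np.abs(w)) / qmax       # one scale per tensor (assumption)
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate full-precision tensor."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, s = quantize_symmetric(w)
err = np.abs(w - dequantize(q, s)).max()   # worst-case rounding error <= scale/2
```

In practice, per-channel scales and calibration over activation statistics usually reduce the accuracy loss further; the sketch only shows the basic full-precision-to-integer conversion.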




[Industrial Talk 3] (15:00~16:00)

Memory Subsystem design for Consumer Electronics

Ken Kyuseok Cho
DDR PHY Professional, MID Team, SIC Center, CTO, LG Electronics, Korea





Ken Kyuseok Cho has been working on high-performance, low-power memory-interfacing IP design for SoCs and memory circuit design engineering since 1995. He has experience with deep sub-micron CMOS technologies, including 16/14/12/10nm FinFET processes, full-custom analog-digital mixed-signal design, and ASIC front-end/back-end design flows. Before joining LG Electronics as an IP design engineer, he worked as a DRAM circuit designer and program manager for 18 years at several major DRAM companies on three continents. He holds a B.S. in ECE and an M.S. in Semiconductor Engineering from Korea University.



Consumer electronic devices with UHD and 8K displays and intelligent, AI-enabled applications require very high-performance, cost-effective, and low-power memory subsystem designs. This talk introduces various challenges in the memory subsystem design of Smart TV SoCs and intelligent consumer devices, along with ideas for solving them.



Tutorials (November 12, Monday, 2018)

[Tutorial 1] (16:15~17:45)

AI Processor with Nano Core-In-Memory Architecture for Function-Safe Autonomous Driving 

Youngsu Kwon

Group Leader, AI Processor Research Group, Electronics and Telecommunications Research Institute (ETRI), Korea




Youngsu Kwon received B.S., M.S., and Ph.D. degrees from Korea Advanced Institute of Science and Technology (KAIST), Republic of Korea, in 1997, 1999, and 2004, respectively. He was a Postdoctoral Associate at the Microsystems Technology Laboratory (MTL), Massachusetts Institute of Technology, from 2004 to 2005, working on the design of a 3-dimensional FPGA. Since 2005 he has been with the Electronics and Telecommunications Research Institute (ETRI), Republic of Korea, where he is now Group Leader of the AI Processor Research Group, Intelligent SoC Research Department. At ETRI, he is leading the design of the Korean AI processor, Aldebaran. He has authored over 30 international journal and conference papers, with special interest in low-power processor core design, many-core architecture, CAD, and algorithmic optimization of circuits and systems. He received the Government Recognition Award for Science and Technology in 2016, the Excellent Researcher Award from the Korea Research Council in 2013, the Industrial Contributor Award from the Korean Federation of SMEs in 2013, and medals from the Samsung Humantech Thesis Prize. The Aldebaran CPU core and application processor, for which he is the lead architect, received the Presidential Award of the Korean Semiconductor Design Competition in 2016.



State-of-the-art neural network accelerators consist of arithmetic engines organized in a mesh-structured datapath surrounded by memory blocks that feed neural data to the datapath. While server-based accelerators coupled with server-class processors can afford large silicon area and high power consumption, electronic control units in autonomous driving vehicles require power-optimized AI processors with a small footprint. An AI processor for mobile applications that integrates general-purpose processor cores with mesh-structured neural network accelerators and high-speed memory, while achieving high performance under low-power and compact-area constraints, necessitates a novel AI processor architecture. In this tutorial, we present the design of an AI processor for electronic systems in autonomous driving vehicles, targeting not only CNN-based object recognition but also MLP-based in-vehicle voice recognition. The AI processor integrates Super-Thread-Cores (STC) for neural network acceleration with function-safe general-purpose cores that satisfy vehicular electronics safety requirements. The STC is composed of tens of thousands of programmable nano-cores organized in a mesh-grid-structured datapath network. Designed based on a thorough analysis of neural network computations, the nano-core-in-memory architecture enhances the computation intensity of the STC by efficiently feeding multi-dimensional activation and kernel data into the nano-cores. The quad function-safe general-purpose cores ensure the functional safety of the Super-Thread-Core to comply with the road-vehicle safety standard ISO 26262. The designed AI processor delivers 32 TFLOPS, enabling hyper-real-time execution of CNNs, RNNs, and FCNs.



[Tutorial 2] (16:15~17:45)
Benchmarking Advanced CMOS and Beyond-CMOS Technologies

Andrew Marshall
Research Professor, Department of Electrical and Computer Engineering, The University of Texas at Dallas, USA




Dr. Andrew Marshall is a research professor at The University of Texas at Dallas, where he specializes in advanced CMOS, analog security, and beyond-CMOS benchmarking. He was with Texas Instruments for 27 years, leading teams developing high-voltage and high-current devices, analog IC design, and power integrated circuits at technology nodes from 10µm to 20nm. Dr. Marshall also worked on benchmarking of semiconductor IC processes, including the performance characteristics of MOS and passive devices. During this time he attained the rank of Texas Instruments Fellow (TI Fellow).

He has authored or co-authored over 100 papers in conferences, peer-reviewed journals, and proceedings, and holds 85 issued patents. Dr. Marshall is a Fellow of the United Kingdom Institute of Physics and a Fellow of the IEEE.


Benchmarking and performance closure have become increasingly important with the continued feature-size reduction of ICs. The aim of this tutorial is to explain the history of benchmarking, beginning with CMOS logic, and to describe how the methodology has evolved and expanded as CMOS density has increased.
Development of each advanced CMOS node is more difficult and expensive than the prior one, and there have been many efforts to create new technologies that extend the performance characteristics of planar CMOS. Some of these are CMOS extensions, such as FinFET devices. Others are so-called beyond-CMOS devices: some are charge-based logic, such as tunnel-FET-based systems, while others are non-charge-based, including nano-magnetic structures, spintronic devices, and quantum structures.
All of these need to be benchmarked and evaluated against each other, since it is important to determine which technologies offer advantages over conventional CMOS. A need has therefore developed for comparative benchmarking across logic families. Here we detail the benchmarking of CMOS and of some of the newer beyond-CMOS technologies, and consider how benchmarking standards have changed with the addition of beyond-CMOS capability.



Short Tutorials (November 14, Wednesday, 2018)

[Short Tutorial 1] (10:45~11:30)
Basics of Jitter Analysis

Jae-Yoon Sim
Professor, Department of Electrical Engineering, Pohang University of Science and Technology (POSTECH), Korea




Jae-Yoon Sim received the B.S., M.S., and Ph.D. degrees in electrical engineering from Pohang University of Science and Technology (POSTECH), Korea, in 1993, 1995, and 1999, respectively. From 1999 to 2005, he was a Senior Engineer at Samsung Electronics, Korea. From 2003 to 2005, he was a Postdoctoral Researcher at the University of Southern California, USA. From 2011 to 2012, he was a Visiting Scholar at the University of Michigan, Ann Arbor, MI, USA. In 2005, he joined POSTECH, where he is currently a Professor. His research interests include clock generation, serial and parallel links, data converters, neuromorphic circuits, and sensor interface circuits. He has served on the Technical Program Committees of the IEEE International Solid-State Circuits Conference, the Symposium on VLSI Circuits, and the Asian Solid-State Circuits Conference. He is a Distinguished Professor nominated by the Korea Institute of Science and Technology, and has been an IEEE Distinguished Lecturer since 2018. He received the Takuo Sugano Award and the Special Author-Recognition Award at ISSCC 2001 and 2013, respectively.




Jitter, the temporal noise of a clock, is a general indicator of the quality of timing generation. There are a number of definitions for measuring the amount of jitter, and they affect performance differently depending on the application. This tutorial reviews the basics of jitter analysis and the design considerations for each type of frequency generator, such as oscillators, phase-locked loops, and the clock/data recovery loop in a serial link.
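As a small numerical illustration of two of the jitter definitions the tutorial covers, the sketch below estimates RMS period jitter and cycle-to-cycle jitter from rising-edge timestamps. The signal model (an ideal 1 GHz clock with Gaussian edge noise) and the function name are illustrative assumptions.

```python
import numpy as np

def jitter_metrics(edges):
    """Estimate RMS period jitter and cycle-to-cycle jitter from rising-edge times."""
    periods = np.diff(edges)               # instantaneous periods
    period_jitter = np.std(periods)        # RMS deviation from the mean period
    c2c_jitter = np.std(np.diff(periods))  # RMS change between adjacent periods
    return period_jitter, c2c_jitter

rng = np.random.default_rng(1)
t_ideal = np.arange(1000) * 1e-9                   # ideal 1 GHz clock edges
edges = t_ideal + rng.normal(0, 2e-12, size=1000)  # 2 ps RMS noise per edge
pj, c2c = jitter_metrics(edges)
```

For independent edge noise of RMS sigma, period jitter tends toward sqrt(2)*sigma and cycle-to-cycle jitter toward sqrt(6)*sigma, since each metric differences two or three noisy edges; accumulated jitter in oscillators behaves differently and is one of the points the tutorial addresses.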




[Short Tutorial 2] (11:30~12:15)

Minimum-Energy-Driven Integrated Circuits Design for Green Electronics

Tony Tae-Hyoung Kim
Associate Professor, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore




Tony Tae-Hyoung Kim received the B.S. and M.S. degrees in electrical engineering from Korea University, Seoul, Korea, in 1999 and 2001, respectively. He received the Ph.D. degree in electrical and computer engineering from the University of Minnesota, Minneapolis, MN, USA in 2009. From 2001 to 2005, he worked at Samsung Electronics, where he performed research on the design of high-speed SRAMs, clock generators, and I/O interface circuits. During the summers of 2007 to 2009, he was with IBM T. J. Watson Research Center and Broadcom Corporation, where he performed research on circuit reliability, low-power SRAM, and battery-backed memory design, respectively. In November 2009, he joined Nanyang Technological University, where he is currently an associate professor.
He received the Best Demo Award at APCCAS 2016, the Low Power Design Contest Award at ISLPED 2016, best paper awards at ISOCC 2014 and 2011, the AMD/CICC Student Scholarship Award at IEEE CICC 2008, a Departmental Research Fellowship from the University of Minnesota in 2008, the DAC/ISSCC Student Design Contest Award in 2008, the Samsung Humantech Thesis Award in 2008, 2001, and 1999, and the ETRI Journal Paper of the Year Award in 2005. He is an author or co-author of more than 140 journal and conference papers and holds 17 registered US and Korean patents. His current research interests include low-power and high-performance digital, mixed-mode, and memory circuit design, ultra-low-voltage circuit and system design, variation- and aging-tolerant circuits and systems, and circuit techniques for 3D ICs. He serves as an Associate Editor of the IEEE Transactions on VLSI Systems. He is an IEEE Senior Member and the Chair of the IEEE Solid-State Circuits Society Singapore Chapter, and has served numerous conferences as a committee member.




Recently, various ultra-low-power applications such as the Internet of Things (IoT), wearable devices, and biomedical devices have emerged, opening up a new domain of integrated circuit design. In these applications, ultra-low-voltage circuit techniques for improving power and energy efficiency have been the main research focus. While supply voltage scaling is considered the most effective way of achieving high energy efficiency, it generates many challenging design issues, such as significantly degraded parametric margins and large variations. This tutorial will provide a brief introduction to various integrated circuit design techniques that are essential for minimum-energy-driven design.
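The trade-off behind a minimum-energy operating point can be sketched with a first-order model: dynamic CV² energy per operation falls as the supply voltage scales down, while leakage energy per operation rises because gates slow down and leak for longer. All constants below are illustrative assumptions, not measured silicon data or material from the tutorial.

```python
import numpy as np

# Illustrative first-order constants (assumptions, not silicon data)
C = 1e-12     # switched capacitance per operation, F
I0 = 3e-5     # leakage current scale, A
Vth = 0.3     # threshold voltage, V
alpha = 1.5   # alpha-power-law delay exponent

def energy_per_op(vdd):
    e_dyn = C * vdd ** 2                            # dynamic CV^2 energy
    t_delay = vdd / (vdd - Vth) ** alpha * 1e-9     # alpha-power delay model
    e_leak = I0 * vdd * t_delay                     # leakage energy over one op
    return e_dyn + e_leak

vdd = np.linspace(0.35, 1.2, 500)                   # sweep above Vth
e = energy_per_op(vdd)
v_min = vdd[np.argmin(e)]                           # minimum-energy supply voltage
```

The sweep shows total energy rising toward both ends of the voltage range, with the minimum at an intermediate supply voltage; in real designs this minimum-energy point typically sits near or below the threshold voltage, which is why the degraded margins and variations mentioned above become the central design problems.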