-
Shinji KIMURA
2009 Volume E92.A Issue 12 Pages
2961
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
-
Yuji TAKASHIMA, Kazuyuki OOYA, Atsushi KUROKAWA
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2962-2970
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
As integrated circuit technology has undergone continuous downscaling to improve LSI performance and reduce chip size, design for manufacturability (DFM) and design for yield (DFY) have become very important. As one of the DFM/DFY methods, a redundant via insertion technique uses as many vias as possible to connect the metal wires between different layers. In this paper, we focus on redundant vias and propose an effective redundant via insertion method for practical use to address manufacturing variability and reliability concerns. First, the results of statistical analysis of via resistance and via capacitance in some real physical layouts are shown, and the impact on circuit delay of the via resistance variation caused by manufacturing variability is clarified. Then, valuation functions for delay variation, electro-migration (EM), and stress-migration (SM) are defined and a practical method for redundant via insertion is proposed. Experimental results show that LSIs with redundant vias inserted by our method are robust against manufacturing variability and reliability problems.
-
Yukihide KOHIRA, Suguru SUEHIRO, Atsushi TAKAHASHI
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2971-2978
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In recent VLSI systems, signal propagation delays are required to meet their specifications with very high accuracy. In order to meet the specifications, the routing of a net often needs to be detoured in order to increase the routing delay. A routing method should utilize a routing area with obstacles as much as possible in order to realize the specifications of all nets simultaneously. In this paper, a fast longer-path algorithm that generates a path of a net in a routing grid so that its length is increased as much as possible is proposed. In the proposed algorithm, an upper bound on the path length that takes the structure of the routing area into account is used. Experiments show that our algorithm utilizes a routing area with obstacles efficiently.
-
Yuchun MA, Xin LI, Yu WANG, Xianlong HONG
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2979-2989
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In 3D IC design, thermal issues are a critical challenge. To eliminate hotspots, physical layouts are usually adjusted by incremental changes, such as shifting or duplicating hot blocks. In this paper, we distinguish three categories of thermal-aware incremental change: migrating computation, growing a unit, and moving hotspot blocks. These modifications, however, may greatly degrade the packing area as well as the interconnect distribution. We therefore devise mixed integer linear programming (MILP) models for these different incremental changes so that multiple objectives can be optimized simultaneously. Furthermore, to avoid random incremental modification, which may be inefficient and require a long runtime to converge, the potential gain of each candidate incremental change is modeled. Based on the potential gain, a novel thermal optimization flow that intelligently chooses the best incremental operation is presented. Experimental results show that migrating computation, growing a unit, and moving hotspots can reduce the maximum on-chip temperature by 7%, 13%, and 15%, respectively, on MCNC/GSRC benchmarks. Experimental results also show that the thermal optimization flow can reduce the maximum on-chip temperature by 14% relative to initial packings generated by the existing 3D floorplanning tool CBA, and achieves better area and total wirelength improvement than the individual operations do. Results with initial packings from CBA_T (the thermal-aware CBA floorplanner) show that a 13.5% temperature reduction can be obtained by our incremental optimization flow.
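For illustration, the gain-driven selection flow sketched in the abstract amounts to a greedy loop. The sketch below is a minimal stand-in only: the operation objects and their estimate_gain/apply hooks are hypothetical, not the paper's potential-gain model.

```python
# Hypothetical sketch of a potential-gain-driven optimization loop.
# estimate_gain() would combine the expected peak-temperature drop with
# area/wirelength penalties; apply() would perform the incremental change
# (migrate computation, grow a unit, or move a hotspot block).
def optimize_thermal(floorplan, operations, max_iters=50):
    for _ in range(max_iters):
        gains = [(op.estimate_gain(floorplan), op) for op in operations]
        best_gain, best_op = max(gains, key=lambda g: g[0])
        if best_gain <= 0:
            break          # no candidate improves the multi-objective cost
        floorplan = best_op.apply(floorplan)
    return floorplan
```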
-
Bei YU, Sheqin DONG, Song CHEN, Satoshi GOTO
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2990-2997
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Low-power design has become a significant requirement as CMOS technology enters the nanometer era. Multiple-Supply Voltage (MSV) is a popular and effective method for both dynamic and static power reduction while maintaining performance. Level shifters, however, may cause area and Interconnect Length Overhead (ILO), and should be considered at both the floorplanning and post-floorplanning stages. In this paper, we propose a two-phase algorithm framework, called VLSAF, to solve the voltage and level-shifter assignment problem. In the floorplanning phase, we use a convex-cost network flow algorithm to assign voltages and a minimum-cost flow algorithm to handle level-shifter assignment. In the post-floorplanning phase, a heuristic method is adopted to redistribute white space and calculate the positions and shapes of level shifters. The experimental results show VLSAF is effective.
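The min-cost-flow flavor of the assignment phase can be illustrated with a toy instance. The graph, capacities, and costs below are invented for illustration (using networkx) and are not the paper's actual network formulation.

```python
# Toy min-cost-flow instance in the spirit of level-shifter assignment:
# blocks that need a shifter are matched to white-space slots at minimum
# total cost (edge weights stand in for wirelength detour penalties).
import networkx as nx

G = nx.DiGraph()
G.add_edge("s", "blk1", capacity=1, weight=0)
G.add_edge("s", "blk2", capacity=1, weight=0)
G.add_edge("blk1", "slotA", capacity=1, weight=3)
G.add_edge("blk1", "slotB", capacity=1, weight=5)
G.add_edge("blk2", "slotB", capacity=1, weight=2)
G.add_edge("slotA", "t", capacity=1, weight=0)
G.add_edge("slotB", "t", capacity=1, weight=0)
G.nodes["s"]["demand"] = -2     # two level shifters to place
G.nodes["t"]["demand"] = 2

flow = nx.min_cost_flow(G)
print(flow["blk1"], flow["blk2"])   # blk1 -> slotA, blk2 -> slotB
```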
-
Yoichi TOMIOKA, Yoshiaki KURATA, Yukihide KOHIRA, Atsushi TAKAHASHI
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2998-3006
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a routing method for 2-layer ball grid array packages that generates a routing pattern satisfying a design rule. In our method, the routing structure on each layer is restricted, while keeping most feasible patterns, to efficiently obtain a feasible routing pattern. A routing pattern that satisfies the design rule is formulated as a mixed integer linear programming problem. In experiments on seven data sets, we obtained routing patterns that satisfy the design rule within practical time by using a mixed integer linear programming solver.
-
Qiang FU, Wai-Shing LUK, Jun TAO, Xuan ZENG, Wei CAI
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3007-3015
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a novel intra-die spatial correlation extraction method referred to as MLEMTC (Maximum Likelihood Estimation for Multiple Test Chips) is presented. In the MLEMTC method, a joint likelihood function is formulated by multiplying the set of individual likelihood functions for all test chips. This joint likelihood function is then maximized to extract a unique group of parameter values of a single spatial correlation function, which can be used for statistical circuit analysis and design. Moreover, to deal with the purely random component and measurement error contained in measurement data, the spatial correlation function combined with the correlation of white noise is used in the extraction, which significantly improves the accuracy of the extraction results. Furthermore, an LU decomposition based technique is developed to calculate the log-determinant of the positive definite matrix within the likelihood function, which solves the numerical stability problem encountered in the direct calculation. Experimental results have shown that the proposed method is efficient and practical.
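The joint-likelihood construction lends itself to a compact sketch. The following is a minimal numpy/scipy illustration under an assumed Gaussian model per chip; the correlation function corr_fn, the data layout, and the optimizer hook are assumptions, not the paper's code.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def neg_joint_loglik(theta, chips, coords, corr_fn):
    """Joint negative log-likelihood over all test chips.
    theta parameterizes the spatial correlation plus a white-noise term;
    chips is a list of per-chip measurement vectors on shared coords."""
    K = corr_fn(coords, theta)     # correlation + noise floor (assumed model)
    lu, piv = lu_factor(K)         # one LU factorization per theta
    # log det(K) from the LU factors: numerically stable for large PD K
    logdet = np.sum(np.log(np.abs(np.diag(lu))))
    n = K.shape[0]
    total = 0.0
    for r in chips:                # product of likelihoods = sum of logs
        quad = r @ lu_solve((lu, piv), r)
        total += 0.5 * (logdet + quad + n * np.log(2.0 * np.pi))
    return total                   # minimize over theta with any optimizer
```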
-
Tsuyoshi SAKATA, Takaaki OKUMURA, Atsushi KUROKAWA, Hidenari NAKASHIMA ...
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3016-3023
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Leakage current is an important quality metric of LSIs (Large Scale Integrated circuits). In this paper, we focus on reducing leakage current variation under process variation. First, we derive a set of quadratic equations to evaluate delay and leakage current under process variation. Using these equations, we discuss cases in which leakage current can be varied without degrading the delay distribution and propose a procedure to reduce leakage current variations. Experiments show that the proposed method effectively reduces the leakage current variation by up to 50% at the 90th percentile of the distribution compared with a conventional design approach.
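The abstract does not reproduce the derived equations. For orientation only, a generic second-order response-surface model of delay and log-leakage in the process-parameter deviations (an illustrative form, not the paper's equations) would read:

```latex
% Illustrative second-order response-surface model in the deviation
% vector \Delta\mathbf{p} of process parameters (not the paper's equations):
D(\Delta\mathbf{p}) \approx D_0 + \mathbf{a}^{\mathsf T}\Delta\mathbf{p}
    + \Delta\mathbf{p}^{\mathsf T}\mathbf{B}\,\Delta\mathbf{p},
\qquad
\log I_{\mathrm{leak}}(\Delta\mathbf{p}) \approx \ell_0
    + \mathbf{c}^{\mathsf T}\Delta\mathbf{p}
    + \Delta\mathbf{p}^{\mathsf T}\mathbf{E}\,\Delta\mathbf{p}
```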
-
Xu LUO, Fan YANG, Xuan ZENG, Jun TAO, Hengliang ZHU, Wei CAI
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3024-3034
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a Modified nested sparse grid based Adaptive Stochastic Collocation Method (MASCM) for block-based Statistical Static Timing Analysis (SSTA). The proposed MASCM employs an improved adaptive strategy derived from the existing Adaptive Stochastic Collocation Method (ASCM) to approximate the key operator MAX during timing analysis. In contrast to ASCM, which uses non-nested sparse grid and tensor product quadratures to approximate the MAX operator for weakly and strongly nonlinear conditions respectively, MASCM uses a modified nested sparse grid quadrature to approximate the MAX operator for both weakly and strongly nonlinear conditions. In the modified nested sparse grid quadrature, we first construct second-order quadrature points based on extended Gauss-Hermite quadrature and the nested sparse grid technique, and then discard those quadrature points that do not contribute significantly to the computation accuracy, to enhance the efficiency of the MAX approximation. Compared with the non-nested sparse grid quadrature, the proposed modified nested sparse grid quadrature not only employs far fewer collocation points, but also offers much higher accuracy. Compared with the tensor product quadrature, it greatly reduces the computational cost while still maintaining sufficient accuracy for the MAX operator approximation. As a result, the proposed MASCM provides comparable accuracy while remarkably reducing the computational cost compared with ASCM. Numerical results show that, with comparable accuracy, MASCM achieves a 50% reduction in run time compared with ASCM.
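For contrast with the nested sparse grid the paper proposes, the plain tensor-product Gauss-Hermite baseline for the expected MAX of two correlated Gaussians looks like this; all parameters are illustrative.

```python
# Tensor-product Gauss-Hermite quadrature for E[max(X, Y)] with jointly
# Gaussian X, Y (the costly baseline a sparse grid improves on).
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(5)   # 5-point rule

def expected_max(mu, sigma, rho):
    total = 0.0
    for xi, wi in zip(nodes, weights):
        for yj, wj in zip(nodes, weights):
            x = mu[0] + np.sqrt(2) * sigma[0] * xi
            # correlate the second variable with the first (Cholesky of 2x2)
            y = mu[1] + np.sqrt(2) * sigma[1] * (rho * xi
                 + np.sqrt(1 - rho**2) * yj)
            total += wi * wj * max(x, y)
    return total / np.pi          # 2-D Gauss-Hermite normalization

print(expected_max(mu=[1.0, 1.2], sigma=[0.1, 0.15], rho=0.3))
```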
-
Yu LIU, Masato YOSHIOKA, Katsumi HOMMA, Toshiyuki SHIBUYA
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3035-3043
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a novel method using a multi-objective optimization algorithm to automatically find the best solution from a topology library of analog circuits. First, this method abstracts the Pareto front of each topology in the library by SPICE simulation. Then, the Pareto front of the topology library is abstracted from the individual Pareto fronts of the topologies in the library, following a theorem we proved. The best solution, defined as the point on the Pareto front of the topology library nearest to the specification, is then calculated from equations derived from the collinearity theorem. After a local search using the Nelder-Mead method maps the calculated best solution back to the design-variable space, the non-dominated best solution is obtained. Compared with traditional optimization methods using single-objective optimization algorithms, this work can efficiently find the best non-dominated solution from multiple topologies for different specifications without additional time-consuming optimization iterations. The experiments demonstrate that this method is feasible and practical in actual analog designs, especially for uncertain or variant multi-dimensional specifications.
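A minimal sketch of the nearest-point selection plus Nelder-Mead refinement follows; the sampled Pareto points, the specification, and the performance_of stand-in for a SPICE-in-the-loop evaluation are all invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# sampled Pareto front of one topology, e.g. (bandwidth, power) pairs
pareto_points = np.array([[1.0, 8.0], [2.0, 4.0], [4.0, 2.0]])
spec = np.array([2.5, 3.0])

# nearest sampled Pareto point to the specification
idx = np.argmin(np.linalg.norm(pareto_points - spec, axis=1))

def performance_of(x):
    """Placeholder for a simulator evaluation of a sizing vector x."""
    return np.array([x[0], 10.0 / max(x[0], 1e-9)])

objective = lambda x: np.linalg.norm(performance_of(x) - spec)
result = minimize(objective, x0=[pareto_points[idx][0]],
                  method="Nelder-Mead")   # local search in design space
print(result.x, result.fun)
```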
-
Mohammad SOLEIMANI, Abdollah KHOEI, Khayrollah HADIDI, Vahid Fagih DIN ...
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3044-3051
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a new structure of a voltage-mode MAX-MIN circuit is presented for nonlinear systems, fuzzy applications, neural networks, etc. A differential pair with an improved cascode current mirror is used to choose the desired input. The advantages of the proposed structure are high operating frequency, high precision, low power consumption, small area, and simple expansion to multiple inputs by adding only three transistors for each extra input. The proposed circuit, simulated with HSPICE in a 0.35µm CMOS process, shows a total power consumption of 85µW at a 5MHz operating frequency from a single 3.3-V supply. Also, the total area of the proposed circuit is about 420µm² for two input voltages, and would increase only negligibly for each extra input.
-
Bo YANG, Shigetoshi NAKATAKE
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3052-3060
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper addresses the problem of optimizing the metalization patterns of back-end connections for power-MOSFET-based drivers, since the back-end connections tend to dominate the on-resistance Ron of the driver. We propose a heuristic algorithm to seek better geometric shapes for the patterns, aiming at minimizing Ron and balancing the current distribution. In order to speed up the analysis, the equivalent resistance network of the driver is modified by inserting ideal switches to avoid repeatedly inverting the admittance matrix. With the behavioral model of the ideal switch, we can significantly accelerate the optimization. Simulation on three drivers from industrial TEG data demonstrates that our algorithm can reduce Ron effectively by shaping metals appropriately within a given routing area.
-
Duo LI, Sheldon X.-D. TAN
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3061-3069
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we present a novel analysis approach for large on-chip power grid circuit analysis. The new approach, called ETBR (extended truncated balanced realization), is based on model order reduction techniques that reduce the circuit matrices before the simulation. Different from the (improved) extended Krylov subspace methods EKS/IEKS [2],[3], ETBR performs fast truncated balanced realization on the response Gramian to reduce the original system. ETBR also avoids the adverse explicit moment representation of the input signals. Instead, it uses a spectrum representation of the input signals in the frequency domain, obtained by fast Fourier transformation. The proposed method is very amenable to threading-based parallel computing, as the response Gramian is computed in a Monte-Carlo-like sampling style and each sample can be computed in parallel. This contrasts with all Krylov subspace based methods like EKS, where moments have to be computed in sequential order. ETBR is also more flexible for different types of input sources and can better capture the high-frequency content than EKS, which leads to more accurate results, especially for fast-changing input signals. Experimental results on a number of large networks (up to one million nodes) show that, given the same order of the reduced model, ETBR is indeed more accurate than the EKS method, especially for input sources rich in high-frequency components. If parallel computing is exploited, ETBR can be an order of magnitude faster than the EKS/IEKS method.
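The sampling flavor of the approach can be sketched compactly. The matrices below are tiny stand-ins for a real power-grid (G, C, B) system; the real method derives sample points from the input spectra and scales to millions of nodes.

```python
# Minimal sketch: sample the frequency response at a few points, stack
# the responses, and take an SVD to obtain a reduced projection basis.
import numpy as np

n, m, k = 8, 2, 4                    # states, inputs, number of samples
rng = np.random.default_rng(0)
G = np.eye(n) + 0.1 * rng.standard_normal((n, n))
C = np.eye(n)
B = rng.standard_normal((n, m))

freqs = np.logspace(3, 9, k)         # sample frequencies (rad/s)
X = []
for w in freqs:                      # independent solves: trivially parallel
    X.append(np.linalg.solve(G + 1j * w * C, B))
X = np.hstack(X)                     # response samples (Gramian surrogate)

U, s, _ = np.linalg.svd(X, full_matrices=False)
V = U[:, :3].real                    # truncated basis (real part: one choice)
G_r, C_r, B_r = V.T @ G @ V, V.T @ C @ V, V.T @ B   # reduced system
```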
-
Takayuki FUKUOKA, Akira TSUCHIYA, Hidetoshi ONODERA
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3070-3078
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a gate delay calculation method for SSTA (Statistical Static Timing Analysis) that considers MIS (Multiple Input Switching). In SSTA, a statistical maximum/minimum operation is necessary to calculate the latest/fastest arrival time at a multiple-input gate. Most SSTA approaches calculate the distribution of the latest/fastest arrival time under the SIS (Single Input Switching) assumption, thereby ignoring the effect of MIS on the gate delay and the output transition time. MIS occurs when multiple inputs of a gate switch nearly simultaneously; ignoring it causes error in the statistical maximum/minimum operation in SSTA. We propose a statistical gate delay model that considers MIS and verify the proposed method by SPICE-based Monte Carlo simulations. Experimental results show that neglecting the MIS effect leads to 80% error in the worst case, while the error of the proposed method is less than 20%.
-
Ken UENO, Tetsuya HIROSE, Tetsuya ASAI, Yoshihito AMEMIYA
Article type: LETTER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3079-3081
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A voltage-controlled oscillator (VCO) tolerant to process variations at low supply voltage is proposed. The circuit consists of an on-chip threshold-voltage-monitoring circuit, a current-source circuit, a body-biasing control circuit, and the delay cells of the VCO. Because variations in low-voltage VCO frequency are mainly determined by variations of the current in the delay cells, a current-compensation technique was adopted using the on-chip threshold-voltage-monitoring circuit and body-biasing circuit techniques. Monte Carlo SPICE simulations demonstrated that the proposed techniques suppress variations in the oscillation frequency by about 65% at a 1-V supply voltage compared with the circuit without them.
-
Yonghee PARK, Junghoe CHOI, Jisuk HONG, Sanghoon LEE, Moonhyun YOO, Ju ...
Article type: LETTER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3082-3085
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Research on predicting and removing lithographic hot-spots has become prevalent in the semiconductor industry, and hot-spot detection is known to be one of the most difficult challenges for achieving high detection coverage. To provide physical design implementation with the designer's preferences for fixing hot-spots, in this paper we present a novel and accurate hot-spot detection method, a so-called "leveling and scoring" algorithm based on a weighted combination of image quality parameters (normalized image log-slope (NILS), mask error enhancement factor (MEEF), and depth of focus (DOF)) from lithography simulation. In our algorithm, a hot-spot scoring function considering severity level is first calibrated with process-window qualification, and a least-squares regression method is then used to calibrate the weighting coefficients for each image quality parameter. Once the scoring function is calibrated with wafer results, our method can be applied to future designs using the same process. Using this calibrated scoring function, we can generate fixing guidance and rules to detect hot-spot areas by locating the edge bias value that leads to a hot-spot-free score level. Finally, we integrate the hot-spot fixing guidance information into a layout editor to facilitate a designer-friendly environment. Applying our method to memory devices of the 60nm node and below, we successfully attained sufficient process-window margin for high-yield mass production.
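The weight calibration step reduces to an ordinary least-squares fit. A minimal sketch follows; the feature values and severity levels are made-up numbers, and the paper's exact scoring function is not reproduced.

```python
# Least-squares calibration of score weights for [NILS, MEEF, DOF]
# against known hot-spot severity levels from process-window data.
import numpy as np

features = np.array([[2.1, 3.5, 0.20],    # one row per candidate location
                     [1.4, 5.2, 0.10],
                     [2.8, 2.1, 0.35]])
severity = np.array([2.0, 4.0, 1.0])      # calibrated severity levels

weights, *_ = np.linalg.lstsq(features, severity, rcond=None)
score = features @ weights                # calibrated hot-spot scores
print(weights, score)
```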
-
Yanni ZHAO, Jinian BIAN, Shujun DENG, Zhiqiu KONG, Kang ZHAO
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3086-3093
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Despite the growing research effort in formal verification, industrial verification often relies on the constrained random simulation methodology, in which constraint solvers are integrated within the simulator as stimulus generators, especially for today's large designs with complex constraints. These stimulus generators need to be fast and to produce well-distributed stimuli to maintain simulation performance. In this paper, we propose a dynamic method to guide stimulus generation by SAT solvers. An adjusting strategy named Tabu Search with Memory (TSwM) is integrated into the stimulus generator for the search and prune processes along with the constraint solver. Experimental results show that the proposed method generates well-distributed stimuli with good performance.
-
Hiroshi FUKETA, Masanori HASHIMOTO, Yukio MITSUYAMA, Takao ONOYE
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3094-3102
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
The timing margin of a chip varies from chip to chip due to manufacturing variability, and depends on the operating environment and aging. Adaptive speed control with timing error prediction is promising for mitigating timing margin variation, but it inherently carries a critical risk of timing errors when a circuit is slowed down. This paper presents how to evaluate the relation between timing error rate and power dissipation in self-adaptive circuits with timing error prediction. The discussion is experimentally validated using adders in subthreshold operation in a 90nm CMOS process. We show a trade-off between timing error rate and power dissipation, and reveal the dependency of the trade-off on design parameters.
-
Qing DONG, Bo YANG, Jing LI, Shigetoshi NAKATAKE
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3103-3110
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents an efficient algorithm for incremental buffer insertion and module resizing for a fully placed floorplan. Our algorithm offers a method to use the white space in a given floorplan to resize modules and insert buffers, while keeping the resulting floorplan as close to the original one as possible. Both buffer insertion and module resizing are modeled as geometric programming problems, and can be solved extremely efficiently using newly developed solution methods. The experimental results suggest that the wirelength difference between the initial floorplan and the result is quite small (less than 5%), and that the global structure of the initial floorplan is preserved very well.
-
Lei CHEN, Shinji KIMURA
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3111-3118
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a new heuristic algorithm is proposed to optimize power domain clustering in controlling-value-based (CV-based) power gating technology. The algorithm considers both the switching activity of the sleep signals (p) and the overall number of sleep gates (gate count, N), and optimizes the sum of the products of p and N. The algorithm effectively exploits the total power reduction obtained from CV-based power gating. Even when the maximum depth is kept the same, the proposed algorithm can still achieve approximately 10% more power reduction than prior algorithms. Furthermore, a detailed comparison between the proposed heuristic algorithm and other possible heuristic algorithms is also presented. HSPICE simulation results show that over 26% total power reduction can be obtained by using the new heuristic algorithm. In addition, the effect of dynamic power reduction through the CV-based power gating method and the delay overhead caused by the switching of sleep transistors are also shown in this paper.
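A greedy sketch of clustering against a p-times-N objective follows. How p combines when gates share one sleep signal is an assumption here (independent sleep conditions, hence a product); the paper's exact cost model and depth constraint are not reproduced.

```python
# Illustrative greedy clustering for a p*N objective: p is the
# sleep-signal activity and N the gate count of a domain.
def saving(domains):
    total = 0.0
    for d in domains:
        p = 1.0
        for pi, _ in d:
            p *= pi                    # joint sleep probability (assumed)
        total += p * sum(n for _, n in d)
    return total

def greedy_cluster(gates, n_domains):
    """gates: list of (p, n) records; merge pairs until n_domains remain,
    always taking the merge that hurts the p*N saving least."""
    domains = [[g] for g in gates]
    while len(domains) > n_domains:
        best = None
        for i in range(len(domains)):
            for j in range(i + 1, len(domains)):
                trial = [d for k, d in enumerate(domains) if k not in (i, j)]
                trial.append(domains[i] + domains[j])
                s = saving(trial)      # exhaustive pairwise scan (sketch)
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        merged = domains[i] + domains[j]
        domains = [d for k, d in enumerate(domains) if k not in (i, j)]
        domains.append(merged)
    return domains

print(greedy_cluster([(0.9, 10), (0.8, 5), (0.2, 20)], n_domains=2))
```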
-
Youhua SHI, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3119-3127
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a novel X-handling technique that removes the effect of unknowns on compacted test responses with a maximal compaction ratio. The proposed method works with current X-tolerant compactors and inserts masking cells on scan paths to selectively mask X's. By doing this, the number of unknown responses in each scan-out cycle can be reduced to a level at which the target X-tolerant compactor tolerates them with guaranteed error detection. The technique guarantees no test loss due to the effect of X's, and achieves the maximal compaction that the target response compactor can provide. Moreover, because the masking cells are inserted only on the scan paths, there is no performance degradation of the designs. Experimental results demonstrate the effectiveness of the proposed method.
-
Yoshinobu HIGAMI, Kewal K. SALUJA, Hiroshi TAKAHASHI, Shin-ya KOBAYASH ...
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3128-3135
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Shorts and opens are two major kinds of defects that are most likely to occur in Very Large Scale Integrated circuits. In modern integrated circuit devices these defects must be considered not only at the gate level but also at the transistor level. In this paper, we propose a method for generating test vectors that targets both transistor shorts (tr-shorts) and transistor opens (tr-opens). Since two consecutive test vectors need to be applied in order to detect tr-opens, we assume a launch-on-capture (LOC) test application mechanism. This makes it possible to detect delay-type defects. Further, the proposed method employs existing stuck-at test generation tools, thus requiring no change in the design and development flow and no new tools. Experimental results for benchmark circuits demonstrate the effectiveness of the proposed method, providing 100% fault efficiency while the test set size remains moderate.
-
Kosuke SHIOKI, Narumi OKADA, Toshiro ISHIHARA, Tetsuya HIROSE, Nobutak ...
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3136-3142
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents an error diagnosis technique for incremental synthesis, called EXLLS (Extended X-algorithm for LUT-based circuit model based on Location sets to rectify Subcircuits), which rectifies five or more functional errors in the whole circuit based on location sets to rectify subcircuits. A conventional error diagnosis technique, called EXLIT, tries to rectify five or more functional errors based on incremental rectification of subcircuits. However, its solution depends on the selection and the order of the modifications on subcircuits, which increases the number of locations to be changed. To overcome this problem, we propose EXLLS, based on location sets to rectify subcircuits, which obtains two or more solutions by separating i) extraction of the location sets to be rectified, and ii) rectification of the whole circuit based on the location sets. Thereby EXLLS can rectify five or more errors with fewer locations to change. Experimental results have shown that EXLLS reduces the increase in the number of locations to be rectified by the conventional technique by 90.1%.
-
Ya-Shih HUANG, Yu-Ju HONG, Juinn-Dar HUANG
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3143-3150
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In deep-submicron technology, several state-of-the-art architectural synthesis flows have already adopted the distributed register architecture to cope with increasing wire delay by allowing multicycle communication. In this article, we regard communication synthesis targeting a refined regular distributed register architecture, named RDR-GRS, as a problem of simultaneous data-transfer routing and scheduling for global interconnect resource minimization. We also present an innovative algorithm that takes both spatial and temporal perspectives. It features both a concentration-oriented path router that gathers wire-sharable data transfers and a channel-based time scheduler that resolves contentions for wires in a channel, working in the spatial and temporal domains, respectively. The experimental results show that the proposed algorithm significantly outperforms existing related works.
-
Junbo YU, Qiang ZHOU, Gang QU, Jinian BIAN
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3151-3159
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
High temperature adversely impacts a circuit's reliability, performance, and leakage power. During behavioral synthesis, both resource usage allocation and resource binding influence the thermal profile. Current thermal-aware behavioral syntheses do not utilize the location information of resources from the floorplan and, in addition, focus only on binding, ignoring allocation. This paper proposes thermal-aware behavioral synthesis with resource usage allocation. Based on a hybrid metric of physical location information and temperature, we rebind operations and reallocate the number of resources under an area constraint. Our approach effectively controls peak temperature and creates even power densities among resources of different types and within resources of the same type. Experimental results show an average 8.6°C drop in peak temperature and a 5.3% saving in total power consumption with little latency overhead.
-
Florin BALASA, Ilie I. LUICAN, Hongwei ZHU, Doru V. NASUI
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3160-3168
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Many signal processing systems, particularly in the multimedia and telecommunication domains, are synthesized to execute data-intensive applications: their cost-related aspects, namely power consumption and chip area, are heavily influenced, if not dominated, by the data access and storage aspects. This paper presents an energy-aware memory allocation methodology. Starting from the high-level behavioral specification of a given application, this framework performs the assignment of the multidimensional signals to the memory layers (the on-chip scratch-pad memory and the off-chip main memory), the goal being the reduction of the dynamic energy consumption in the memory subsystem. Based on the assignment results, the framework subsequently performs the mapping of signals into both memory layers such that the overall amount of data storage is reduced. This software system yields a complete allocation solution: the exact storage amount on each memory layer, the mapping functions that determine the exact locations for any array element (scalar signal) in the specification, and an estimate of the dynamic energy consumption in the memory subsystem.
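A highly simplified sketch of the layer-assignment idea follows, assuming a greedy access-density heuristic (signals with the most accesses per byte go to the scratch-pad first). The paper's actual assignment and mapping models are more elaborate; the numbers below are illustrative.

```python
# Greedy scratch-pad (SPM) vs. off-chip DRAM assignment by access density.
def assign_to_spm(signals, spm_capacity):
    """signals: list of (name, size_bytes, access_count) tuples."""
    ranked = sorted(signals, key=lambda s: s[2] / s[1], reverse=True)
    spm, dram, used = [], [], 0
    for name, size, acc in ranked:
        if used + size <= spm_capacity:
            spm.append(name)
            used += size
        else:
            dram.append(name)
    return spm, dram

spm, dram = assign_to_spm(
    [("A", 4096, 100000), ("B", 65536, 120000), ("C", 1024, 50000)],
    spm_capacity=8192)
print(spm, dram)    # frequently accessed small arrays land in the SPM
```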
-
Akira OHCHI, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3169-3179
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
As device feature size decreases, interconnection delay becomes the dominating factor of circuit total delay.
Distributed-register architectures can reduce the influence of interconnection delay. They may, however, increase circuit area because they require many local registers. Moreover, original distributed-register architectures do not consider control signal delay, which may be the bottleneck in a circuit. In this paper, we propose a high-level synthesis method targeting a generalized distributed-register architecture in which we introduce shared/local registers and global/local controllers. Our method is based on iterative improvement of scheduling/binding and floorplanning. First, we prepare shared-register groups with global controllers, each of which corresponds to a single functional unit. As iterations proceed, we use local registers and local controllers for functional units on a critical path. Shared-register groups physically located close to each other are merged into a single group, and their global controllers are merged accordingly. Finally, our method obtains a generalized distributed-register architecture whose scheduling/binding and floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 4.7% while keeping circuit performance equal to that obtained with original distributed-register architectures.
-
Gi-Ho PARK, Jung-Wook PARK, Gunok JUNG, Shin-Dug KIM
Article type: LETTER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3180-3181
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents wordline gating logic for reducing unnecessary BTB accesses. A partial bit of the branch predictor is simultaneously recorded in the middle of the BTB to prevent further SRAM operation. Experimental results with embedded applications showed that the proposed mechanism reduces BTB power consumption by around 38%.
-
Farhad MEHDIPOUR, Hamid NOORI, Koji INOUE, Kazuaki MURAKAMI
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3182-3192
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
The multitude of parameters in the design process of a reconfigurable instruction-set processor (RISP) may lead to a large design space and remarkable complexity. A quantitative design approach uses data collected from applications to satisfy design constraints and optimize design goals while considering the applications' characteristics; however, it depends highly on designer observations and analyses. Exploring the design space can be considered an effective technique to find a proper balance among the various design parameters. However, this approach becomes computationally expensive when the performance evaluation of design points is based on the synthesis-and-simulation technique. A combined analytical and simulation-based model (CAnSO) is proposed and validated for performance evaluation of a typical RISP. The proposed model consists of an analytical core that incorporates statistics collected from cycle-accurate simulation to make a reasonable evaluation and provide valuable insight. CAnSO has clear speed advantages and can therefore be used to ease a cumbersome design space exploration of a reconfigurable RISP and to quickly evaluate the performance of slightly modified architectures.
-
Liang-Bi CHEN, Chi-Tsai YEH, Hung-Yu CHEN, Ing-Jer HUANG
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3193-3202
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
3D graphics applications are widely used in consumer electronics, an inevitable tendency for the future. In general, a higher abstraction level is used to model a complex system like a 3D graphics SoC. The key issues, however, are how to traverse the design space hierarchically, reduce simulation time, and refine performance quickly. This paper demonstrates a system-level design space exploration model for tile-based 3D graphics SoC refinement. This model uses UML tools that assist designers in traversing the whole system, and it reduces simulation time dramatically by adopting SystemC. As a result, system performance is improved by 198% for the geometry function and by 69% for the rendering function.
-
Dajiang ZHOU, Jinjia ZHOU, Jiayi ZHU, Satoshi GOTO
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3203-3210
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a highly parallel deblocking filter architecture for H.264/AVC is proposed that processes one macroblock in 48 clock cycles and gives real-time support to QFHD@60fps sequences at less than 100MHz. Four edge filters organized in two groups, processing vertical and horizontal edges simultaneously, are applied in this architecture to enhance its throughput. As parallelism increases, pipeline hazards arise owing to the latency of the edge filters and the data dependency of the deblocking algorithm. To solve this problem, a zig-zag processing schedule is proposed to eliminate the pipeline bubbles. The data path of the architecture is then derived according to the processing schedule and optimized through data-flow merging, so as to minimize the cost of logic and internal buffers. Meanwhile, the architecture's data input rate is designed to be identical to its throughput, and the transmission order of input data matches the zig-zag processing schedule. Therefore no intercommunication buffer is required between the deblocking filter and its preceding component for speed matching or data reordering. As a result, only one 24×64 two-port SRAM is required as an internal buffer in this design. When synthesized with the SMIC 130nm process, the architecture costs a gate count of 30.2k, which is competitive considering its high performance.
-
Yue QIAN, Zhonghai LU, Wenhua DOU
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3211-3220
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
We investigate per-flow flit and packet worst-case delay bounds in on-chip wormhole networks. Such investigation is essential in order to provide guarantees under worst-case conditions in cost-constrained systems, as required by many hard real-time embedded applications. We first propose analysis models for flow control, link and buffer sharing. Based on these analysis models, we obtain an open-ended service analysis model capturing the combined effect of flow control, link and buffer sharing. With the service analysis model, we compute equivalent service curves for individual flows, and then derive their flit and packet delay bounds. Our experimental results verify that our analytical bounds are correct and tight.
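The service-curve machinery referenced here follows network calculus. For orientation, the textbook delay bound (not the paper's specific wormhole result, which builds its own arrival and service curves for flow control, link and buffer sharing) takes the form:

```latex
% Generic network-calculus delay bound: for a flow with arrival curve
% \alpha and service curve \beta, the worst-case delay is bounded by the
% maximal horizontal deviation h(\alpha, \beta):
D_{\max} \le h(\alpha, \beta)
        = \sup_{t \ge 0} \inf\{\, \tau \ge 0 : \alpha(t) \le \beta(t + \tau) \,\}
% For token-bucket arrivals \alpha(t) = b + r t and rate-latency service
% \beta(t) = R\,[t - T]^{+} with R \ge r, this reduces to
D_{\max} \le T + b / R
```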
-
Ming-Chih CHEN, Shen-Fu HSIAO
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3221-3228
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose an area-efficient design of an Advanced Encryption Standard (AES) processor by applying a new common-expression-elimination (CSE) method to the sub-functions of the various transformations required in AES. The proposed method reduces the area cost of realizing the sub-functions by extracting the common factors in the bit-level XOR/AND-based sum-of-product expressions of these sub-functions using a new CSE algorithm. Cell-based implementation results show that the AES processor with our proposed CSE method achieves significant area improvement compared with previous designs.
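To make the idea concrete, here is a generic greedy common-factor extraction over bit-level XOR expressions. This is a toy sketch of the general technique, not the paper's algorithm; the signal naming is invented.

```python
# Greedy sharing of the most frequent XOR input pair across equations.
from collections import Counter
from itertools import combinations

def extract_once(exprs):
    """exprs: list of frozensets of input bits XOR-ed together.
    Factor the most common input pair out into a new shared signal."""
    pairs = Counter()
    for e in exprs:
        for pair in combinations(sorted(e), 2):
            pairs[pair] += 1
    (a, b), count = pairs.most_common(1)[0]
    if count < 2:
        return exprs, None                   # nothing worth sharing
    t = f"t_{a}_{b}"                         # new shared XOR gate a^b
    new = [frozenset(e - {a, b} | {t}) if {a, b} <= e else e for e in exprs]
    return new, (t, a, b)

eqs = [frozenset("abc"), frozenset("abd"), frozenset("abcd")]
eqs, shared = extract_once(eqs)   # shares a^b across all three outputs
print(shared, eqs)
```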
-
Ryuta NARA, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3229-3237
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A scan chain is one of the most important testing techniques, but it can be exploited for side-channel attacks against a cryptography LSI. We focus on scan-based attacks, in which scan chains are targeted as a side channel. Conventional scan-based attacks only consider a scan chain composed solely of the registers in a cryptography circuit. However, a cryptography LSI usually includes many circuits, such as memories, microprocessors, and others. This means that the conventional attacks cannot be applied to a practical scan chain composed of various types of registers. In this paper, a scan-based attack that makes it possible to decipher the secret key in an AES cryptography LSI composed of an AES circuit and other circuits is proposed. By focusing on the bit pattern of a specific register and monitoring its change, our scan-based attack eliminates the influence of registers included in circuits other than AES. Our attack does not depend on the scan chain architecture, and it can decipher practical AES cryptography LSIs.
-
Nobuaki TOJO, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3238-3247
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Recently, a two-level cache, consisting of an L1 cache and an L2 cache, has become common in processors. Particularly in an embedded system in which a single application or a class of applications is repeatedly executed on a processor, the cache configuration can be customized so that an optimal one is achieved. An optimal two-level cache configuration that minimizes overall memory access time or memory energy consumption can be obtained by varying three cache parameters, the number of sets, the line size, and the associativity, for both the L1 cache and the L2 cache. In this paper, we first extend an L1 cache simulation algorithm so that we can explore two-level cache configurations. Second, we propose two two-level cache design space exploration algorithms, CRCB-T1 and CRCB-T2, each of which is based on applying the Cache Inclusion Property to two-level cache configuration. Each of the proposed algorithms realizes exact cache simulation but decreases the number of cache hit/miss judgments by a factor of several thousand. Experimental results show that, by using our approach, the number of cache hit/miss judgments required to optimize a cache configuration is reduced to 1/50-1/5500 of that of the exhaustive approach. As a result, our proposed approach runs an average of 1398.25 times faster than the exhaustive approach, achieving, to the authors' knowledge, the world's fastest two-level cache design space exploration.
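To make the per-configuration simulation cost concrete, here is a minimal two-level hit/miss counter (LRU replacement; write policies omitted). All parameters are illustrative; this is the brute-force style of evaluation whose repetition the CRCB algorithms avoid.

```python
# Minimal set-associative cache simulator with LRU replacement.
class Cache:
    def __init__(self, n_sets, ways, line_size):
        self.n_sets, self.ways, self.line = n_sets, ways, line_size
        self.sets = [[] for _ in range(n_sets)]   # per-set LRU list of tags
        self.hits = self.misses = 0

    def access(self, addr):
        block = addr // self.line
        idx, tag = block % self.n_sets, block // self.n_sets
        s = self.sets[idx]
        if tag in s:
            s.remove(tag)
            s.append(tag)                         # move to MRU position
            self.hits += 1
            return True
        self.misses += 1
        if len(s) == self.ways:
            s.pop(0)                              # evict the LRU tag
        s.append(tag)
        return False

l1, l2 = Cache(64, 2, 32), Cache(256, 8, 64)
for a in [0, 32, 0, 4096, 0]:
    if not l1.access(a):
        l2.access(a)                              # L2 sees only L1 misses
print(l1.hits, l1.misses, l2.hits, l2.misses)
```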
-
Sumek WISAYATAKSIN, Dongju LI, Tsuyoshi ISSHIKI, Hiroaki KUNIEDA
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3248-3257
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
An entropy decoding engine plays an important role in modern multimedia decoders. Previous research focused on decoding performance paid considerable attention to only one parameter, such as the data parsing speed, and did not consider the performance impact of the table configuration time and memory size. In this paper, we develop a novel entropy decoding method based on a two-step group matching scheme. Our approach achieves high performance in both data parsing speed and configuration time with a small memory requirement. We also deploy our decoding scheme to implement an entropy decoding processor, which performs operations based on normal processor instructions and VLD instructions for decoding variable-length codes. Several extended VLD instructions are provided to speed up the bitstream parsing process in modern multimedia applications. This processor provides a solution with software flexibility and hardware speed for stand-alone entropy decoding engines. The VLSI hardware is designed with the Language for Instruction Set Architecture (LISA), with 23Kgates and a 110MHz maximum clock frequency under TSMC 0.18µm technology. Experimental simulations revealed that the proposed processor achieves high performance and is suitable for many practical applications such as MPEG-2, MPEG-4, H.264/AVC and AAC.
-
Takuji HIEDA, Hiroaki TANAKA, Keishi SAKANUSHI, Yoshinori TAKEUCHI, Ma ...
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3258-3267
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Partial forwarding is a design method that places forwarding paths on part of a processor pipeline. The hardware cost of a processor can be reduced without performance loss by partial forwarding. However, a compiler with an instruction scheduler that considers the partial forwarding structure of the target processor is required, since conventional scheduling algorithms cannot make the most of a partial forwarding structure. In this paper, we propose a heuristic instruction scheduling method for processors with a partial forwarding structure. The proposed algorithm uses available distance to schedule instructions suitably for the target partial forwarding processor. Experimental results show that the proposed method generates near-optimal solutions in practical time, and some of the codes optimized for the partial forwarding processor run in the shortest time among the target processors. They also show that the proposed method is superior to a hazard detection unit.
-
Taiga TAKATA, Yusuke MATSUNAGA
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3268-3275
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Recent technology mappers for LUT-based FPGAs employ cut enumeration. Although many cuts are often needed to find a good network, enumerating all cuts of large size consumes a lot of run-time. Existing algorithms employ bottom-up merging, which calculates Cartesian products of the fanins' cuts for each node. The number of cuts is much smaller than the size of the Cartesian products in most cases; thus, the existing algorithms are inefficient. Furthermore, the number of cuts increases exponentially with cut size, which makes the run-time much longer. Several algorithms that enumerate not all cuts but partial cuts have been presented [8],[9], but they tend to degrade the quality of networks. This paper presents two algorithms to enumerate cuts: an exhaustive enumeration and a partial enumeration. Both are efficient because they do not employ bottom-up merging. The partial enumeration reduces the number of enumerated cuts with a guarantee that a depth-minimum network can be constructed. The experimental results show that the exhaustive enumeration runs about 5 and 13 times faster than the existing bottom-up algorithm [12] for K=8 and 9, respectively, while producing the same results. The partial enumeration runs about 9 and 29 times faster than the existing algorithm for K=8 and 9, respectively. The average area of networks derived from the sets of cuts enumerated by the partial enumeration is only 4% larger than that derived using all cuts, and the depth is the same.
-
Mitsuho YAMADA
2009 Volume E92.A Issue 12 Pages
3276
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
-
Shin'ya NISHIDA
Article type: INVITED PAPER
2009 Volume E92.A Issue 12 Pages
3277-3283
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Objective assessment of image and video quality should be based on a correct understanding of subjective assessment by human observers. Previous models have incorporated the mechanisms of early visual processing in image quality metrics, enabling us to evaluate the visibility of errors from the original images. However, to understand how human observers perceive image quality, one should also consider higher stages of visual processing where perception is established. In higher stages, the visual system presumably represents a visual scene as a collection of meaningful components such as objects and events. Our recent psychophysical studies suggest two principles related to this level of processing. First, the human visual system integrates shape and color signals along perceived motion trajectories in order to improve visibility of the shape and color of moving objects. Second, the human visual system estimates surface reflectance properties like glossiness using simple image statistics rather than by inverse computation of image formation optics. Although the underlying neural mechanisms are still under investigation, these computational principles are potentially useful for the development of effective image processing technologies and for quality assessment. Ideally, if a model can specify how a given image is transformed into high-level scene representations in the human brain, it would predict many aspects of subjective image quality, including fidelity and naturalness.
-
Toru YAMADA, Yoshihiro MIYAMOTO, Yuzo SENDA, Masahiro SERIZAWA
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3284-3290
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a reduced-reference video-quality estimation method suitable for individual end-user quality monitoring of IPTV services. With the proposed method, the activity values for individual given-size pixel blocks of an original video are transmitted to end-user terminals. At the end-user terminals, the video quality of a received video is estimated on the basis of the activity difference between the original video and the received video. Psychovisual weightings and video-quality score adjustments for fatal degradations are applied to improve estimation accuracy. In addition, low-bit-rate transmission is achieved by using temporal sub-sampling and by transmitting only the lower six bits of each activity value. The proposed method achieves accurate video quality estimation using only low-bit-rate original video information (15kbps for SDTV). The correlation coefficient between actual subjective video quality and estimated quality is 0.901 with 15kbps side information. The proposed method does not need computationally demanding spatial and gain-and-offset registrations. Therefore, it is suitable for real-time video-quality monitoring in IPTV services.
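A minimal sketch of the block-activity feature follows, assuming per-block pixel variance as the activity measure; the paper's exact activity definition, psychovisual weighting, and score mapping are not reproduced.

```python
# Per-block activity and the reduced-reference activity difference.
import numpy as np

def block_activity(frame, bs=16):
    """frame: 2-D grayscale array; returns one activity value per block."""
    h, w = frame.shape
    h, w = h - h % bs, w - w % bs
    blocks = frame[:h, :w].reshape(h // bs, bs, w // bs, bs)
    return blocks.var(axis=(1, 3))     # variance as an activity proxy

def activity_difference(original, received, bs=16):
    return np.abs(block_activity(original, bs)
                  - block_activity(received, bs)).mean()
```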
-
Kenji SUGIYAMA, Naoya SAGARA, Yohei KASHIMURA
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3291-3296
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
With DCT coding, block artifacts and mosquito noise appear in decoded pictures. The control of post filtering is important for reducing these degradations without causing side effects. Decoding information is useful if the filter is inside or close to the encoder; however, such control is difficult with independent post filtering, such as in a display. In this case, control requires estimating the artifact from only the decoded picture. In this work, we describe an estimation method that determines the mosquito-noise block and level. In this method, the ratio of spatial activity is taken between the mosquito block and the neighboring flat block. We test the proposed method using reconstructed pictures coded with different quantization scales, and find that the results are mostly reasonable across the different quantizations.
-
Kazuhisa YAMAGISHI, Takanori HAYASHI
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3297-3306
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A non-intrusive packet-layer model is required to passively monitor the quality of experience (QoE) during service. We propose a packet-layer model that can be used to estimate the video quality of IPTV using quality parameters derived from transmitted packet headers. The computational load of the model is lighter than that of models that take video signals and/or video-related bitstream information, such as motion vectors, as input. The model is applicable even if the transmitted bitstream information is encrypted, because it uses transmitted packet headers rather than bitstream information. To develop the model, we conducted three extensive subjective quality assessments for different encoders and decoders (codecs) and different video content. We then modeled the subjective video quality assessment characteristics based on objective features affected by coding and packet loss. Finally, we verified the model's validity by applying it to unknown data sets different from the training data sets.
-
Amal PUNCHIHEWA, Jonathan ARMSTRONG, Seiichiro HANGAI, Takayuki HAMAMO ...
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3307-3312
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a novel approach to analysing the colour bleeding caused by image compression, achieved by isolating two components of colour bleeding and evaluating them separately. Although these specific components of colour bleeding have not been studied in great detail in the past, with the use of a synthetic test pattern, similar to the colour bars used to test analogue television transmissions, we have successfully isolated and evaluated "colour blur" and "colour ringing" as two separate components of the colour bleeding artefact. We have also developed metrics for these artefacts, and tested these derived metrics in a series of trials aimed at testing the colour reproduction performance of a JPEG codec and a JPEG2000 codec, both implemented by the developer IrfanView. The algorithms developed to measure these artefact metrics proved to be effective tools for evaluating and benchmarking the performance of similar codecs, or different implementations of the same codecs.
-
Rachel Mabanag CHONG, Toshihisa TANAKA
Article type: PAPER
Subject area: Imaging
2009 Volume E92.A Issue 12 Pages
3313-3320
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A new algorithm for simultaneously detecting and identifying invariant blurs is proposed. It is mainly based on the behavior of extrema values in an image. It is computationally simple and fast, making it suitable for preprocessing, especially in practical imaging applications. Benefits of employing this method include the elimination of unnecessary processing, since unblurred images are separated from the blurred ones, which require deconvolution. Additionally, it can improve reconstruction performance by properly identifying the blur type, so that a more effective blur-specific deconvolution algorithm can be applied. Experimental results on natural images and their synthetically blurred versions show the characteristics and validity of the proposed method. Furthermore, it can be observed that feature selection makes the method more efficient and effective.
-
Hideyasu KUNIBA, Roy S. BERNS
Article type: PAPER
Subject area: Imaging
2009 Volume E92.A Issue 12 Pages
3321-3327
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Image sensor noise was estimated in an approximately perceptually uniform space with a color image sensor model. In particular, the noise level with respect to an image sensor's pixel pitch and dark noise was investigated. It was shown that the noise level could be roughly halved when the spectral sensitivity was optimized considering noise, at the cost of reduced color reproduction accuracy. It was also shown that for a 2.0µm pixel pitch sensor, the exposure index should be less than 100-150 in order to keep the noise level σ94 below 5 even with no dark noise, whereas the exposure index could reach about 2000-4000 for an 8.0µm pixel pitch sensor, depending on the sensor sensitivity and the dark noise level.
-
Masayuki UKISHIMA, Hitomi KANEKO, Toshiya NAKAGUCHI, Norimichi TSUMURA ...
Article type: PAPER
Subject area: Printing
2009 Volume E92.A Issue 12 Pages
3328-3335
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
The image quality of halftone prints is significantly influenced by the optical characteristics of paper. Light scattering in paper produces optical dot gain, which has a significant influence on the tone and color reproduction of halftone prints. The light scattering can be quantified by the Modulation Transfer Function (MTF) of paper. Several methods have been proposed to measure the MTF of paper; however, these methods have problems in measurement efficiency or accuracy. In this article, a new method is proposed to measure the MTF of paper efficiently and accurately, and the dot gain effect on halftone prints is analyzed. The MTF is calculated from the ratio, in the spatial frequency domain, between the responses of incident pencil light to the paper and to a perfect specular reflector. Since the spatial frequency characteristic of the input pencil light can be obtained from the response of the perfect specular reflector, there is no need to produce an input illuminant with an "ideal" impulse characteristic. Our method is experimentally efficient since only two images need to be measured, and it is accurate since the data can be approximated by the conventional MTF model. Next, we predict the reflectance distribution of a halftone print using the MTF measured in microscopy, in order to analyze the dot gain effect, since it can clearly be observed in the halftone microstructure. Finally, a simulation is carried out to remove the light scattering effect from the predicted image. Since the simulated image is not affected by optical dot gain, it can be used to analyze the real dot coverage.
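The frequency-domain ratio at the heart of the measurement can be sketched in a few lines. One-dimensional line-spread profiles are used here for brevity (real measurements are 2-D images), and the normalization choice is an assumption.

```python
# MTF of paper as the spectral ratio of the paper response to the
# perfect-specular-reflector response (1-D sketch).
import numpy as np

def paper_mtf(paper_response, reference_response):
    """Both arguments: line-spread profiles sampled on the same grid."""
    P = np.abs(np.fft.rfft(paper_response))
    R = np.abs(np.fft.rfft(reference_response))
    mtf = P / np.maximum(R, 1e-12)    # guard against division by zero
    return mtf / mtf[0]               # normalize to 1 at DC
```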
-
Zisheng LI, Jun-ichi IMAI, Masahide KANEKO
Article type: PAPER
Subject area: Processing
2009 Volume E92.A Issue 12 Pages
3336-3343
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Active Shape Model (ASM) is a powerful statistical tool for image interpretation, especially in face alignment. In the standard ASM, local appearances are described by intensity profiles, and the model parameter estimation is based on the assumption that the profiles follow a Gaussian distribution. It suffers from variations in pose, illumination, expression, and occlusion. In this paper, an improved ASM framework, GentleBoost-based SIFT-ASM, is proposed. Local appearances of landmarks are instead represented by SIFT (Scale-Invariant Feature Transform) descriptors, which are gradient orientation histograms computed over the image neighborhood. They can provide more robust and accurate guidance for search than grey-level profiles. Moreover, GentleBoost classifiers are applied to model and search the SIFT features, instead of relying on the unnecessary assumption of a Gaussian distribution. Experimental results show that SIFT-ASM significantly outperforms the original ASM in aligning and localizing facial features.
-
Osama AHMED OMER, Toshihisa TANAKA
Article type: PAPER
Subject area: Processing
2009 Volume E92.A Issue 12 Pages
3344-3354
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper addresses problems appearing in restoration algorithms that utilize both Tikhonov and bilateral total variation (BTV) regularization. The former assumes that the prior information has a Gaussian distribution, which fails at edges, while the latter depends highly on the selected bilateral filter parameters. To overcome these problems, we propose a locally adaptive regularization. In the proposed algorithm, we use general directional regularization functions with adaptive weights. The adaptive weights are estimated from local patches based on properties of the partially restored image. Unlike Tikhonov regularization, the method avoids smoothing across edges by using adaptive weights. In addition, unlike BTV regularization, the proposed regularization function does not depend on parameter selection. The convexity conditions as well as the convergence conditions are derived for the proposed algorithm.
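For context, one common form of the BTV regularizer that the abstract contrasts with is shown below (after Farsiu et al.); the paper's adaptive variant replaces the fixed geometric weights with locally estimated ones.

```latex
% Bilateral total variation with shift operators S_x^k, S_y^l (shifts of
% k and l pixels) and a fixed decay weight 0 < \alpha < 1:
R_{\mathrm{BTV}}(\mathbf{x}) = \sum_{k=-P}^{P}\;\sum_{\substack{l=0 \\ k+l \ge 0}}^{P}
    \alpha^{|k|+|l|}\,\bigl\lVert \mathbf{x} - S_x^{k} S_y^{l}\,\mathbf{x} \bigr\rVert_1
```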
-
LeThanh HA, Chun-Su PARK, Seung-Won JUNG, Sung-Jea KO
Article type: PAPER
Subject area: Coding
2009 Volume E92.A Issue 12 Pages
3355-3360
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Context-based Adaptive Binary Arithmetic Coding (CABAC) is adopted as an entropy coding tool in the main profile of the video coding standard H.264/AVC. CABAC achieves a high degree of redundancy reduction by estimating the conditional probability of each binary symbol that is input to the arithmetic coder. This paper presents an entropy coding method based on CABAC. In the proposed method, each binary symbol is coded using a more precisely estimated conditional probability, leading to improved performance. We apply our method to the standard and evaluate its performance for different video sources and various quantization parameters (QP). Experimental results show that our method outperforms the original CABAC in terms of coding efficiency, with average bit-rate savings of up to 1.2%.
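To illustrate the general idea of adaptive binary probability estimation, here is a generic exponential-decay estimator of the kind such coders feed to the arithmetic coding stage. This is an illustration only, not the H.264 LPS/MPS state machine or the paper's refined estimator.

```python
# Generic adaptive estimate of P(bit = 1) for a binary symbol stream.
def probability_trace(bits, alpha=0.05):
    p_one = 0.5                    # uninformed initial estimate
    estimates = []
    for b in bits:
        estimates.append(p_one)    # probability used to code this bit
        p_one += alpha * (b - p_one)   # move estimate toward observed bit
    return estimates

print(probability_trace([1, 1, 0, 1, 1, 1, 0, 1])[-1])
```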