-
Shinji KIMURA
2009 Volume E92.A Issue 12 Pages
2961
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
-
Yuji TAKASHIMA, Kazuyuki OOYA, Atsushi KUROKAWA
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2962-2970
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
As integrated circuit technology has undergone continuous downscaling to improve LSI performance and reduce chip size, design for manufacturability (DFM) and design for yield (DFY) have become very important. As one of the DFM/DFY methods, a redundant via insertion technique uses as many vias as possible to connect the metal wires between different layers. In this paper, we focus on redundant vias and propose an effective redundant via insertion method for practical use to address manufacturing variability and reliability concerns. First, the results of statistical analysis of via resistance and via capacitance in some real physical layouts are shown, and the impact on circuit delay of the via resistance variation caused by manufacturing variability is clarified. Then, valuation functions for delay variation, electro-migration (EM), and stress-migration (SM) are defined and a practical method for redundant via insertion is proposed. Experimental results show that LSIs with redundant vias inserted by our method are robust against manufacturing variability and reliability problems.
-
Yukihide KOHIRA, Suguru SUEHIRO, Atsushi TAKAHASHI
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2971-2978
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In recent VLSI systems, signal propagation delays are required to meet their specifications with very high accuracy. In order to meet the specifications, the routing of a net often needs to be detoured in order to increase the routing delay. A routing method should utilize a routing area with obstacles as much as possible in order to realize the specifications of all nets simultaneously. In this paper, a fast longer-path algorithm that generates a path of a net in a routing grid so that its length is increased as much as possible is proposed. In the proposed algorithm, an upper bound on the path length that takes the structure of the routing area into account is used. Experiments show that our algorithm utilizes a routing area with obstacles efficiently.
-
Yuchun MA, Xin LI, Yu WANG, Xianlong HONG
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2979-2989
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In 3D IC design, thermal issues are a critical challenge. To eliminate hotspots, physical layouts are usually adjusted by incremental changes, such as shifting or duplicating hot blocks. In this paper, we distinguish three categories of thermal-aware incremental change: migrating computation, growing a unit, and moving hotspot blocks. These modifications, however, may greatly degrade the packing area as well as the interconnect distribution. We therefore devise mixed integer linear programming (MILP) models for these different incremental changes so that multiple objectives can be optimized simultaneously. Furthermore, to avoid random incremental modification, which may be inefficient and require a long runtime to converge, the potential gain of each candidate incremental change is modeled. Based on the potential gain, a novel thermal optimization flow that intelligently chooses the best incremental operation is presented. Experimental results show that migrating computation, growing a unit, and moving hotspots can reduce the maximum on-chip temperature by 7%, 13%, and 15%, respectively, on MCNC/GSRC benchmarks. Experimental results also show that the thermal optimization flow can reduce the maximum on-chip temperature by 14% relative to initial packings generated by the existing 3D floorplanning tool CBA, and achieves better area and total wirelength improvement than the individual operations do. Results with initial packings from CBA_T (the thermal-aware CBA floorplanner) show that a 13.5% temperature reduction can be obtained by our incremental optimization flow.
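For illustration, the gain-driven selection flow sketched in the abstract amounts to a greedy loop. The sketch below is a minimal stand-in only: the operation objects and their estimate_gain/apply hooks are hypothetical, not the paper's potential-gain model.

```python
# Hypothetical sketch of a potential-gain-driven optimization loop.
# estimate_gain() would combine the expected peak-temperature drop with
# area/wirelength penalties; apply() would perform the incremental change
# (migrate computation, grow a unit, or move a hotspot block).
def optimize_thermal(floorplan, operations, max_iters=50):
    for _ in range(max_iters):
        gains = [(op.estimate_gain(floorplan), op) for op in operations]
        best_gain, best_op = max(gains, key=lambda g: g[0])
        if best_gain <= 0:
            break          # no candidate improves the multi-objective cost
        floorplan = best_op.apply(floorplan)
    return floorplan
```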
-
Bei YU, Sheqin DONG, Song CHEN, Satoshi GOTO
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2990-2997
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Low-power design has become a significant requirement as CMOS technology enters the nanometer era. Multiple-Supply Voltage (MSV) is a popular and effective method for both dynamic and static power reduction while maintaining performance. Level shifters, however, may cause area and Interconnect Length Overhead (ILO), and should be considered at both the floorplanning and post-floorplanning stages. In this paper, we propose a two-phase algorithm framework, called VLSAF, to solve the voltage and level-shifter assignment problem. In the floorplanning phase, we use a convex-cost network flow algorithm to assign voltages and a minimum-cost flow algorithm to handle level-shifter assignment. In the post-floorplanning phase, a heuristic method is adopted to redistribute white space and calculate the positions and shapes of level shifters. The experimental results show VLSAF is effective.
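The min-cost-flow flavor of the assignment phase can be illustrated with a toy instance. The graph, capacities, and costs below are invented for illustration (using networkx) and are not the paper's actual network formulation.

```python
# Toy min-cost-flow instance in the spirit of level-shifter assignment:
# blocks that need a shifter are matched to white-space slots at minimum
# total cost (edge weights stand in for wirelength detour penalties).
import networkx as nx

G = nx.DiGraph()
G.add_edge("s", "blk1", capacity=1, weight=0)
G.add_edge("s", "blk2", capacity=1, weight=0)
G.add_edge("blk1", "slotA", capacity=1, weight=3)
G.add_edge("blk1", "slotB", capacity=1, weight=5)
G.add_edge("blk2", "slotB", capacity=1, weight=2)
G.add_edge("slotA", "t", capacity=1, weight=0)
G.add_edge("slotB", "t", capacity=1, weight=0)
G.nodes["s"]["demand"] = -2     # two level shifters to place
G.nodes["t"]["demand"] = 2

flow = nx.min_cost_flow(G)
print(flow["blk1"], flow["blk2"])   # blk1 -> slotA, blk2 -> slotB
```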
-
Yoichi TOMIOKA, Yoshiaki KURATA, Yukihide KOHIRA, Atsushi TAKAHASHI
Article type: PAPER
Subject area: Physical Level Design
2009 Volume E92.A Issue 12 Pages
2998-3006
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a routing method for 2-layer ball grid array packages that generates a routing pattern satisfying a design rule. In our method, the routing structure on each layer is restricted, while keeping most feasible patterns, to efficiently obtain a feasible routing pattern. A routing pattern that satisfies the design rule is formulated as a mixed integer linear programming problem. In experiments on seven data sets, we obtained routing patterns that satisfy the design rule within practical time by using a mixed integer linear programming solver.
-
Qiang FU, Wai-Shing LUK, Jun TAO, Xuan ZENG, Wei CAI
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3007-3015
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a novel intra-die spatial correlation extraction method referred to as MLEMTC (Maximum Likelihood Estimation for Multiple Test Chips) is presented. In the MLEMTC method, a joint likelihood function is formulated by multiplying the set of individual likelihood functions for all test chips. This joint likelihood function is then maximized to extract a unique group of parameter values of a single spatial correlation function, which can be used for statistical circuit analysis and design. Moreover, to deal with the purely random component and measurement error contained in measurement data, the spatial correlation function combined with the correlation of white noise is used in the extraction, which significantly improves the accuracy of the extraction results. Furthermore, an LU decomposition based technique is developed to calculate the log-determinant of the positive definite matrix within the likelihood function, which solves the numerical stability problem encountered in the direct calculation. Experimental results have shown that the proposed method is efficient and practical.
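The joint-likelihood construction lends itself to a compact sketch. The following is a minimal numpy/scipy illustration under an assumed Gaussian model per chip; the correlation function corr_fn, the data layout, and the optimizer hook are assumptions, not the paper's code.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def neg_joint_loglik(theta, chips, coords, corr_fn):
    """Joint negative log-likelihood over all test chips.
    theta parameterizes the spatial correlation plus a white-noise term;
    chips is a list of per-chip measurement vectors on shared coords."""
    K = corr_fn(coords, theta)     # correlation + noise floor (assumed model)
    lu, piv = lu_factor(K)         # one LU factorization per theta
    # log det(K) from the LU factors: numerically stable for large PD K
    logdet = np.sum(np.log(np.abs(np.diag(lu))))
    n = K.shape[0]
    total = 0.0
    for r in chips:                # product of likelihoods = sum of logs
        quad = r @ lu_solve((lu, piv), r)
        total += 0.5 * (logdet + quad + n * np.log(2.0 * np.pi))
    return total                   # minimize over theta with any optimizer
```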
-
Tsuyoshi SAKATA, Takaaki OKUMURA, Atsushi KUROKAWA, Hidenari NAKASHIMA ...
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3016-3023
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Leakage current is an important quality metric of LSIs (Large Scale Integrated circuits). In this paper, we focus on reducing leakage current variation under process variation. First, we derive a set of quadratic equations to evaluate delay and leakage current under process variation. Using these equations, we discuss cases in which leakage current can be varied without degrading the delay distribution and propose a procedure to reduce leakage current variations. Experiments show that the proposed method effectively reduces the leakage current variation by up to 50% at the 90th percentile of the distribution compared with a conventional design approach.
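The abstract does not reproduce the derived equations. For orientation only, a generic second-order response-surface model of delay and log-leakage in the process-parameter deviations (an illustrative form, not the paper's equations) would read:

```latex
% Illustrative second-order response-surface model in the deviation
% vector \Delta\mathbf{p} of process parameters (not the paper's equations):
D(\Delta\mathbf{p}) \approx D_0 + \mathbf{a}^{\mathsf T}\Delta\mathbf{p}
    + \Delta\mathbf{p}^{\mathsf T}\mathbf{B}\,\Delta\mathbf{p},
\qquad
\log I_{\mathrm{leak}}(\Delta\mathbf{p}) \approx \ell_0
    + \mathbf{c}^{\mathsf T}\Delta\mathbf{p}
    + \Delta\mathbf{p}^{\mathsf T}\mathbf{E}\,\Delta\mathbf{p}
```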
-
Xu LUO, Fan YANG, Xuan ZENG, Jun TAO, Hengliang ZHU, Wei CAI
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3024-3034
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a Modified nested sparse grid based Adaptive Stochastic Collocation Method (MASCM) for block-based Statistical Static Timing Analysis (SSTA). The proposed MASCM employs an improved adaptive strategy derived from the existing Adaptive Stochastic Collocation Method (ASCM) to approximate the key operator MAX during timing analysis. In contrast to ASCM, which uses non-nested sparse grid and tensor product quadratures to approximate the MAX operator for weakly and strongly nonlinear conditions respectively, MASCM uses a modified nested sparse grid quadrature to approximate the MAX operator for both weakly and strongly nonlinear conditions. In the modified nested sparse grid quadrature, we first construct second-order quadrature points based on extended Gauss-Hermite quadrature and the nested sparse grid technique, and then discard those quadrature points that do not contribute significantly to the computation accuracy, to enhance the efficiency of the MAX approximation. Compared with the non-nested sparse grid quadrature, the proposed modified nested sparse grid quadrature not only employs far fewer collocation points, but also offers much higher accuracy. Compared with the tensor product quadrature, it greatly reduces the computational cost while still maintaining sufficient accuracy for the MAX operator approximation. As a result, the proposed MASCM provides comparable accuracy while remarkably reducing the computational cost compared with ASCM. Numerical results show that, with comparable accuracy, MASCM achieves a 50% reduction in run time compared with ASCM.
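For contrast with the nested sparse grid the paper proposes, the plain tensor-product Gauss-Hermite baseline for the expected MAX of two correlated Gaussians looks like this; all parameters are illustrative.

```python
# Tensor-product Gauss-Hermite quadrature for E[max(X, Y)] with jointly
# Gaussian X, Y (the costly baseline a sparse grid improves on).
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(5)   # 5-point rule

def expected_max(mu, sigma, rho):
    total = 0.0
    for xi, wi in zip(nodes, weights):
        for yj, wj in zip(nodes, weights):
            x = mu[0] + np.sqrt(2) * sigma[0] * xi
            # correlate the second variable with the first (Cholesky of 2x2)
            y = mu[1] + np.sqrt(2) * sigma[1] * (rho * xi
                 + np.sqrt(1 - rho**2) * yj)
            total += wi * wj * max(x, y)
    return total / np.pi          # 2-D Gauss-Hermite normalization

print(expected_max(mu=[1.0, 1.2], sigma=[0.1, 0.15], rho=0.3))
```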
-
Yu LIU, Masato YOSHIOKA, Katsumi HOMMA, Toshiyuki SHIBUYA
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3035-3043
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a novel method using a multi-objective optimization algorithm to automatically find the best solution from a topology library of analog circuits. First, this method abstracts the Pareto front of each topology in the library by SPICE simulation. Then, the Pareto front of the topology library is abstracted from the individual Pareto fronts of the topologies in the library, following a theorem we proved. The best solution, defined as the point on the Pareto front of the topology library nearest to the specification, is then calculated from equations derived from the collinearity theorem. After a local search using the Nelder-Mead method maps the calculated best solution back to the design-variable space, the non-dominated best solution is obtained. Compared with traditional optimization methods using single-objective optimization algorithms, this work can efficiently find the best non-dominated solution from multiple topologies for different specifications without additional time-consuming optimization iterations. The experiments demonstrate that this method is feasible and practical in actual analog designs, especially for uncertain or variant multi-dimensional specifications.
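A minimal sketch of the nearest-point selection plus Nelder-Mead refinement follows; the sampled Pareto points, the specification, and the performance_of stand-in for a SPICE-in-the-loop evaluation are all invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# sampled Pareto front of one topology, e.g. (bandwidth, power) pairs
pareto_points = np.array([[1.0, 8.0], [2.0, 4.0], [4.0, 2.0]])
spec = np.array([2.5, 3.0])

# nearest sampled Pareto point to the specification
idx = np.argmin(np.linalg.norm(pareto_points - spec, axis=1))

def performance_of(x):
    """Placeholder for a simulator evaluation of a sizing vector x."""
    return np.array([x[0], 10.0 / max(x[0], 1e-9)])

objective = lambda x: np.linalg.norm(performance_of(x) - spec)
result = minimize(objective, x0=[pareto_points[idx][0]],
                  method="Nelder-Mead")   # local search in design space
print(result.x, result.fun)
```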
-
Mohammad SOLEIMANI, Abdollah KHOEI, Khayrollah HADIDI, Vahid Fagih DIN ...
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3044-3051
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a new structure of a voltage-mode MAX-MIN circuit is presented for nonlinear systems, fuzzy applications, neural networks, etc. A differential pair with an improved cascode current mirror is used to choose the desired input. The advantages of the proposed structure are high operating frequency, high precision, low power consumption, small area, and simple expansion to multiple inputs by adding only three transistors for each extra input. The proposed circuit, simulated with HSPICE in a 0.35µm CMOS process, shows a total power consumption of 85µW at a 5MHz operating frequency from a single 3.3-V supply. Also, the total area of the proposed circuit is about 420µm² for two input voltages, and would increase only negligibly for each extra input.
-
Bo YANG, Shigetoshi NAKATAKE
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3052-3060
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper addresses the problem of optimizing the metalization patterns of back-end connections for power-MOSFET-based drivers, since the back-end connections tend to dominate the on-resistance Ron of the driver. We propose a heuristic algorithm to seek better geometric shapes for the patterns, aiming at minimizing Ron and balancing the current distribution. In order to speed up the analysis, the equivalent resistance network of the driver is modified by inserting ideal switches to avoid repeatedly inverting the admittance matrix. With the behavioral model of the ideal switch, we can significantly accelerate the optimization. Simulation on three drivers from industrial TEG data demonstrates that our algorithm can reduce Ron effectively by shaping metals appropriately within a given routing area.
-
Duo LI, Sheldon X.-D. TAN
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3061-3069
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we present a novel analysis approach for large on-chip power grid circuit analysis. The new approach, called ETBR (extended truncated balanced realization), is based on model order reduction techniques that reduce the circuit matrices before the simulation. Different from the (improved) extended Krylov subspace methods EKS/IEKS [2],[3], ETBR performs fast truncated balanced realization on the response Gramian to reduce the original system. ETBR also avoids the adverse explicit moment representation of the input signals. Instead, it uses a spectrum representation of the input signals in the frequency domain, obtained by fast Fourier transformation. The proposed method is very amenable to threading-based parallel computing, as the response Gramian is computed in a Monte-Carlo-like sampling style and each sample can be computed in parallel. This contrasts with all Krylov subspace based methods like EKS, where moments have to be computed in sequential order. ETBR is also more flexible for different types of input sources and can better capture the high-frequency content than EKS, which leads to more accurate results, especially for fast-changing input signals. Experimental results on a number of large networks (up to one million nodes) show that, given the same order of the reduced model, ETBR is indeed more accurate than the EKS method, especially for input sources rich in high-frequency components. If parallel computing is exploited, ETBR can be an order of magnitude faster than the EKS/IEKS method.
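The sampling flavor of the approach can be sketched compactly. The matrices below are tiny stand-ins for a real power-grid (G, C, B) system; the real method derives sample points from the input spectra and scales to millions of nodes.

```python
# Minimal sketch: sample the frequency response at a few points, stack
# the responses, and take an SVD to obtain a reduced projection basis.
import numpy as np

n, m, k = 8, 2, 4                    # states, inputs, number of samples
rng = np.random.default_rng(0)
G = np.eye(n) + 0.1 * rng.standard_normal((n, n))
C = np.eye(n)
B = rng.standard_normal((n, m))

freqs = np.logspace(3, 9, k)         # sample frequencies (rad/s)
X = []
for w in freqs:                      # independent solves: trivially parallel
    X.append(np.linalg.solve(G + 1j * w * C, B))
X = np.hstack(X)                     # response samples (Gramian surrogate)

U, s, _ = np.linalg.svd(X, full_matrices=False)
V = U[:, :3].real                    # truncated basis (real part: one choice)
G_r, C_r, B_r = V.T @ G @ V, V.T @ C @ V, V.T @ B   # reduced system
```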
-
Takayuki FUKUOKA, Akira TSUCHIYA, Hidetoshi ONODERA
Article type: PAPER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3070-3078
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose a gate delay calculation method for SSTA (Statistical Static Timing Analysis) that considers MIS (Multiple Input Switching). In SSTA, a statistical maximum/minimum operation is necessary to calculate the latest/fastest arrival time at a multiple-input gate. Most SSTA approaches calculate the distribution of the latest/fastest arrival time under the SIS (Single Input Switching) assumption, thereby ignoring the effect of MIS on the gate delay and the output transition time. MIS occurs when multiple inputs of a gate switch nearly simultaneously; ignoring it causes error in the statistical maximum/minimum operation in SSTA. We propose a statistical gate delay model that considers MIS and verify the proposed method by SPICE-based Monte Carlo simulations. Experimental results show that neglecting the MIS effect leads to 80% error in the worst case, while the error of the proposed method is less than 20%.
-
Ken UENO, Tetsuya HIROSE, Tetsuya ASAI, Yoshihito AMEMIYA
Article type: LETTER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3079-3081
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A voltage-controlled oscillator (VCO) tolerant to process variations at low supply voltage is proposed. The circuit consists of an on-chip threshold-voltage-monitoring circuit, a current-source circuit, a body-biasing control circuit, and the delay cells of the VCO. Because variations in low-voltage VCO frequency are mainly determined by variations of the current in the delay cells, a current-compensation technique was adopted using the on-chip threshold-voltage-monitoring circuit and body-biasing circuit techniques. Monte Carlo SPICE simulations demonstrated that the proposed techniques suppress variations in the oscillation frequency by about 65% at a 1-V supply voltage compared with the circuit without them.
-
Yonghee PARK, Junghoe CHOI, Jisuk HONG, Sanghoon LEE, Moonhyun YOO, Ju ...
Article type: LETTER
Subject area: Device and Circuit Modeling and Analysis
2009 Volume E92.A Issue 12 Pages
3082-3085
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Research on predicting and removing lithographic hot-spots has become prevalent in the semiconductor industry, and hot-spot detection is known to be one of the most difficult challenges for achieving high detection coverage. To provide physical design implementation with the designer's preferences for fixing hot-spots, in this paper we present a novel and accurate hot-spot detection method, a so-called "leveling and scoring" algorithm based on a weighted combination of image quality parameters (normalized image log-slope (NILS), mask error enhancement factor (MEEF), and depth of focus (DOF)) from lithography simulation. In our algorithm, a hot-spot scoring function considering severity level is first calibrated with process-window qualification, and a least-squares regression method is then used to calibrate the weighting coefficients for each image quality parameter. Once the scoring function is calibrated with wafer results, our method can be applied to future designs using the same process. Using this calibrated scoring function, we can generate fixing guidance and rules to detect hot-spot areas by locating the edge bias value that leads to a hot-spot-free score level. Finally, we integrate the hot-spot fixing guidance information into a layout editor to facilitate a designer-friendly environment. Applying our method to memory devices of the 60nm node and below, we successfully attained sufficient process-window margin for high-yield mass production.
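The weight calibration step reduces to an ordinary least-squares fit. A minimal sketch follows; the feature values and severity levels are made-up numbers, and the paper's exact scoring function is not reproduced.

```python
# Least-squares calibration of score weights for [NILS, MEEF, DOF]
# against known hot-spot severity levels from process-window data.
import numpy as np

features = np.array([[2.1, 3.5, 0.20],    # one row per candidate location
                     [1.4, 5.2, 0.10],
                     [2.8, 2.1, 0.35]])
severity = np.array([2.0, 4.0, 1.0])      # calibrated severity levels

weights, *_ = np.linalg.lstsq(features, severity, rcond=None)
score = features @ weights                # calibrated hot-spot scores
print(weights, score)
```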
-
Yanni ZHAO, Jinian BIAN, Shujun DENG, Zhiqiu KONG, Kang ZHAO
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3086-3093
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Despite the growing research effort in formal verification, industrial verification often relies on the constrained random simulation methodology, in which constraint solvers are integrated within the simulator as stimulus generators, especially for today's large designs with complex constraints. These stimulus generators need to be fast and to produce well-distributed stimuli to maintain simulation performance. In this paper, we propose a dynamic method to guide stimulus generation by SAT solvers. An adjusting strategy named Tabu Search with Memory (TSwM) is integrated into the stimulus generator for the search and prune processes along with the constraint solver. Experimental results show that the proposed method generates well-distributed stimuli with good performance.
-
Hiroshi FUKETA, Masanori HASHIMOTO, Yukio MITSUYAMA, Takao ONOYE
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3094-3102
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
The timing margin of a chip varies from chip to chip due to manufacturing variability, and depends on the operating environment and aging. Adaptive speed control with timing error prediction is promising for mitigating timing margin variation, but it inherently carries a critical risk of timing errors when a circuit is slowed down. This paper presents how to evaluate the relation between timing error rate and power dissipation in self-adaptive circuits with timing error prediction. The discussion is experimentally validated using adders in subthreshold operation in a 90nm CMOS process. We show a trade-off between timing error rate and power dissipation, and reveal the dependency of the trade-off on design parameters.
-
Qing DONG, Bo YANG, Jing LI, Shigetoshi NAKATAKE
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3103-3110
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents an efficient algorithm for incremental buffer insertion and module resizing for a fully placed floorplan. Our algorithm offers a method to use the white space in a given floorplan to resize modules and insert buffers, while keeping the resulting floorplan as close to the original one as possible. Both buffer insertion and module resizing are modeled as geometric programming problems, and can be solved extremely efficiently using newly developed solution methods. The experimental results suggest that the wirelength difference between the initial floorplan and the result is quite small (less than 5%), and that the global structure of the initial floorplan is preserved very well.
-
Lei CHEN, Shinji KIMURA
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3111-3118
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a new heuristic algorithm is proposed to optimize power domain clustering in controlling-value-based (CV-based) power gating technology. The algorithm considers both the switching activity of the sleep signals (p) and the overall number of sleep gates (gate count, N), and optimizes the sum of the products of p and N. The algorithm effectively exploits the total power reduction obtained from CV-based power gating. Even when the maximum depth is kept the same, the proposed algorithm can still achieve approximately 10% more power reduction than prior algorithms. Furthermore, a detailed comparison between the proposed heuristic algorithm and other possible heuristic algorithms is also presented. HSPICE simulation results show that over 26% total power reduction can be obtained by using the new heuristic algorithm. In addition, the effect of dynamic power reduction through the CV-based power gating method and the delay overhead caused by the switching of sleep transistors are also shown in this paper.
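A greedy sketch of clustering against a p-times-N objective follows. How p combines when gates share one sleep signal is an assumption here (independent sleep conditions, hence a product); the paper's exact cost model and depth constraint are not reproduced.

```python
# Illustrative greedy clustering for a p*N objective: p is the
# sleep-signal activity and N the gate count of a domain.
def saving(domains):
    total = 0.0
    for d in domains:
        p = 1.0
        for pi, _ in d:
            p *= pi                    # joint sleep probability (assumed)
        total += p * sum(n for _, n in d)
    return total

def greedy_cluster(gates, n_domains):
    """gates: list of (p, n) records; merge pairs until n_domains remain,
    always taking the merge that hurts the p*N saving least."""
    domains = [[g] for g in gates]
    while len(domains) > n_domains:
        best = None
        for i in range(len(domains)):
            for j in range(i + 1, len(domains)):
                trial = [d for k, d in enumerate(domains) if k not in (i, j)]
                trial.append(domains[i] + domains[j])
                s = saving(trial)      # exhaustive pairwise scan (sketch)
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        merged = domains[i] + domains[j]
        domains = [d for k, d in enumerate(domains) if k not in (i, j)]
        domains.append(merged)
    return domains

print(greedy_cluster([(0.9, 10), (0.8, 5), (0.2, 20)], n_domains=2))
```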
-
Youhua SHI, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3119-3127
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a novel X-handling technique that removes the effect of unknowns on compacted test responses with a maximal compaction ratio. The proposed method works with current X-tolerant compactors and inserts masking cells on scan paths to selectively mask X's. By doing this, the number of unknown responses in each scan-out cycle can be reduced to a level at which the target X-tolerant compactor tolerates them with guaranteed error detection. The technique guarantees no test loss due to the effect of X's, and achieves the maximal compaction that the target response compactor can provide. Moreover, because the masking cells are inserted only on the scan paths, there is no performance degradation of the designs. Experimental results demonstrate the effectiveness of the proposed method.
-
Yoshinobu HIGAMI, Kewal K. SALUJA, Hiroshi TAKAHASHI, Shin-ya KOBAYASH ...
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3128-3135
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Shorts and opens are two major kinds of defects that are most likely to occur in Very Large Scale Integrated circuits. In modern integrated circuit devices these defects must be considered not only at the gate level but also at the transistor level. In this paper, we propose a method for generating test vectors that targets both transistor shorts (tr-shorts) and transistor opens (tr-opens). Since two consecutive test vectors need to be applied in order to detect tr-opens, we assume a launch-on-capture (LOC) test application mechanism. This makes it possible to detect delay-type defects. Further, the proposed method employs existing stuck-at test generation tools, thus requiring no change in the design and development flow and no new tools. Experimental results for benchmark circuits demonstrate the effectiveness of the proposed method, providing 100% fault efficiency while the test set size remains moderate.
-
Kosuke SHIOKI, Narumi OKADA, Toshiro ISHIHARA, Tetsuya HIROSE, Nobutak ...
Article type: PAPER
Subject area: Logic Synthesis, Test and Verification
2009 Volume E92.A Issue 12 Pages
3136-3142
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents an error diagnosis technique for incremental synthesis, called EXLLS (Extended X-algorithm for LUT-based circuit model based on Location sets to rectify Subcircuits), which rectifies five or more functional errors in the whole circuit based on location sets to rectify subcircuits. A conventional error diagnosis technique, called EXLIT, tries to rectify five or more functional errors based on incremental rectification of subcircuits. However, its solution depends on the selection and the order of the modifications on subcircuits, which increases the number of locations to be changed. To overcome this problem, we propose EXLLS, based on location sets to rectify subcircuits, which obtains two or more solutions by separating i) extraction of the location sets to be rectified, and ii) rectification of the whole circuit based on the location sets. Thereby EXLLS can rectify five or more errors with fewer locations to change. Experimental results have shown that EXLLS reduces the increase in the number of locations to be rectified by the conventional technique by 90.1%.
-
Ya-Shih HUANG, Yu-Ju HONG, Juinn-Dar HUANG
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3143-3150
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In deep-submicron technology, several state-of-the-art architectural synthesis flows have already adopted the distributed register architecture to cope with increasing wire delay by allowing multicycle communication. In this article, we regard communication synthesis targeting a refined regular distributed register architecture, named RDR-GRS, as a problem of simultaneous data-transfer routing and scheduling for global interconnect resource minimization. We also present an innovative algorithm that takes both spatial and temporal perspectives. It features both a concentration-oriented path router that gathers wire-sharable data transfers and a channel-based time scheduler that resolves contentions for wires in a channel, working in the spatial and temporal domains, respectively. The experimental results show that the proposed algorithm significantly outperforms existing related works.
-
Junbo YU, Qiang ZHOU, Gang QU, Jinian BIAN
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3151-3159
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
High temperature adversely impacts a circuit's reliability, performance, and leakage power. During behavioral synthesis, both resource usage allocation and resource binding influence the thermal profile. Current thermal-aware behavioral syntheses do not utilize the location information of resources from the floorplan and, in addition, focus only on binding, ignoring allocation. This paper proposes thermal-aware behavioral synthesis with resource usage allocation. Based on a hybrid metric of physical location information and temperature, we rebind operations and reallocate the number of resources under an area constraint. Our approach effectively controls peak temperature and creates even power densities among resources of different types and within resources of the same type. Experimental results show an average 8.6°C drop in peak temperature and a 5.3% saving in total power consumption with little latency overhead.
-
Florin BALASA, Ilie I. LUICAN, Hongwei ZHU, Doru V. NASUI
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3160-3168
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Many signal processing systems, particularly in the multimedia and telecommunication domains, are synthesized to execute data-intensive applications: their cost-related aspects, namely power consumption and chip area, are heavily influenced, if not dominated, by the data access and storage aspects. This paper presents an energy-aware memory allocation methodology. Starting from the high-level behavioral specification of a given application, this framework performs the assignment of the multidimensional signals to the memory layers (the on-chip scratch-pad memory and the off-chip main memory), the goal being the reduction of the dynamic energy consumption in the memory subsystem. Based on the assignment results, the framework subsequently performs the mapping of signals into both memory layers such that the overall amount of data storage is reduced. This software system yields a complete allocation solution: the exact storage amount on each memory layer, the mapping functions that determine the exact locations for any array element (scalar signal) in the specification, and an estimate of the dynamic energy consumption in the memory subsystem.
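A highly simplified sketch of the layer-assignment idea follows, assuming a greedy access-density heuristic (signals with the most accesses per byte go to the scratch-pad first). The paper's actual assignment and mapping models are more elaborate; the numbers below are illustrative.

```python
# Greedy scratch-pad (SPM) vs. off-chip DRAM assignment by access density.
def assign_to_spm(signals, spm_capacity):
    """signals: list of (name, size_bytes, access_count) tuples."""
    ranked = sorted(signals, key=lambda s: s[2] / s[1], reverse=True)
    spm, dram, used = [], [], 0
    for name, size, acc in ranked:
        if used + size <= spm_capacity:
            spm.append(name)
            used += size
        else:
            dram.append(name)
    return spm, dram

spm, dram = assign_to_spm(
    [("A", 4096, 100000), ("B", 65536, 120000), ("C", 1024, 50000)],
    spm_capacity=8192)
print(spm, dram)    # frequently accessed small arrays land in the SPM
```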
-
Akira OHCHI, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3169-3179
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
As device feature size decreases, interconnection delay becomes the dominating factor of circuit total delay.
Distributed-register architectures can reduce the influence of interconnection delay. They may, however, increase circuit area because they require many local registers. Moreover, original distributed-register architectures do not consider control signal delay, which may be the bottleneck in a circuit. In this paper, we propose a high-level synthesis method targeting a generalized distributed-register architecture in which we introduce shared/local registers and global/local controllers. Our method is based on iterative improvement of scheduling/binding and floorplanning. First, we prepare shared-register groups with global controllers, each of which corresponds to a single functional unit. As iterations proceed, we use local registers and local controllers for functional units on a critical path. Shared-register groups physically located close to each other are merged into a single group, and their global controllers are merged accordingly. Finally, our method obtains a generalized distributed-register architecture whose scheduling/binding and floorplanning are simultaneously optimized. Experimental results show that the area is decreased by 4.7% while keeping circuit performance equal to that obtained with original distributed-register architectures.
-
Gi-Ho PARK, Jung-Wook PARK, Gunok JUNG, Shin-Dug KIM
Article type: LETTER
Subject area: High-Level Synthesis and System-Level Design
2009 Volume E92.A Issue 12 Pages
3180-3181
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents wordline gating logic for reducing unnecessary BTB accesses. A partial bit of the branch predictor is simultaneously recorded in the middle of the BTB to prevent further SRAM operation. Experimental results with embedded applications showed that the proposed mechanism reduces BTB power consumption by around 38%.
-
Farhad MEHDIPOUR, Hamid NOORI, Koji INOUE, Kazuaki MURAKAMI
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3182-3192
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
The multitude of parameters in the design process of a reconfigurable instruction-set processor (RISP) may lead to a large design space and remarkable complexity. A quantitative design approach uses data collected from applications to satisfy design constraints and optimize design goals while considering the applications' characteristics; however, it depends highly on designer observations and analyses. Exploring the design space can be considered an effective technique to find a proper balance among the various design parameters. However, this approach becomes computationally expensive when the performance evaluation of design points is based on the synthesis-and-simulation technique. A combined analytical and simulation-based model (CAnSO) is proposed and validated for performance evaluation of a typical RISP. The proposed model consists of an analytical core that incorporates statistics collected from cycle-accurate simulation to make a reasonable evaluation and provide valuable insight. CAnSO has clear speed advantages and can therefore be used to ease a cumbersome design space exploration of a reconfigurable RISP and to quickly evaluate the performance of slightly modified architectures.
-
Liang-Bi CHEN, Chi-Tsai YEH, Hung-Yu CHEN, Ing-Jer HUANG
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3193-3202
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
3D graphics applications are widely used in consumer electronics, an inevitable tendency for the future. In general, a higher abstraction level is used to model a complex system like a 3D graphics SoC. The key issues, however, are how to traverse the design space hierarchically, reduce simulation time, and refine performance quickly. This paper demonstrates a system-level design space exploration model for tile-based 3D graphics SoC refinement. This model uses UML tools that assist designers in traversing the whole system, and it reduces simulation time dramatically by adopting SystemC. As a result, system performance is improved by 198% for the geometry function and by 69% for the rendering function.
-
Dajiang ZHOU, Jinjia ZHOU, Jiayi ZHU, Satoshi GOTO
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3203-3210
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, a highly parallel deblocking filter architecture for H.264/AVC is proposed that processes one macroblock in 48 clock cycles and gives real-time support to QFHD@60fps sequences at less than 100MHz. Four edge filters organized in two groups, processing vertical and horizontal edges simultaneously, are applied in this architecture to enhance its throughput. As parallelism increases, pipeline hazards arise owing to the latency of the edge filters and the data dependency of the deblocking algorithm. To solve this problem, a zig-zag processing schedule is proposed to eliminate the pipeline bubbles. The data path of the architecture is then derived according to the processing schedule and optimized through data-flow merging, so as to minimize the cost of logic and internal buffers. Meanwhile, the architecture's data input rate is designed to be identical to its throughput, and the transmission order of input data matches the zig-zag processing schedule. Therefore no intercommunication buffer is required between the deblocking filter and its preceding component for speed matching or data reordering. As a result, only one 24×64 two-port SRAM is required as an internal buffer in this design. When synthesized with the SMIC 130nm process, the architecture costs a gate count of 30.2k, which is competitive considering its high performance.
-
Yue QIAN, Zhonghai LU, Wenhua DOU
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3211-3220
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
We investigate per-flow flit and packet worst-case delay bounds in on-chip wormhole networks. Such investigation is essential in order to provide guarantees under worst-case conditions in cost-constrained systems, as required by many hard real-time embedded applications. We first propose analysis models for flow control, link and buffer sharing. Based on these analysis models, we obtain an open-ended service analysis model capturing the combined effect of flow control, link and buffer sharing. With the service analysis model, we compute equivalent service curves for individual flows, and then derive their flit and packet delay bounds. Our experimental results verify that our analytical bounds are correct and tight.
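The service-curve machinery referenced here follows network calculus. For orientation, the textbook delay bound (not the paper's specific wormhole result, which builds its own arrival and service curves for flow control, link and buffer sharing) takes the form:

```latex
% Generic network-calculus delay bound: for a flow with arrival curve
% \alpha and service curve \beta, the worst-case delay is bounded by the
% maximal horizontal deviation h(\alpha, \beta):
D_{\max} \le h(\alpha, \beta)
        = \sup_{t \ge 0} \inf\{\, \tau \ge 0 : \alpha(t) \le \beta(t + \tau) \,\}
% For token-bucket arrivals \alpha(t) = b + r t and rate-latency service
% \beta(t) = R\,[t - T]^{+} with R \ge r, this reduces to
D_{\max} \le T + b / R
```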
-
Ming-Chih CHEN, Shen-Fu HSIAO
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3221-3228
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
In this paper, we propose an area-efficient design of an Advanced Encryption Standard (AES) processor by applying a new common-expression-elimination (CSE) method to the sub-functions of the various transformations required in AES. The proposed method reduces the area cost of realizing the sub-functions by extracting the common factors in the bit-level XOR/AND-based sum-of-product expressions of these sub-functions using a new CSE algorithm. Cell-based implementation results show that the AES processor with our proposed CSE method achieves significant area improvement compared with previous designs.
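To make the idea concrete, here is a generic greedy common-factor extraction over bit-level XOR expressions. This is a toy sketch of the general technique, not the paper's algorithm; the signal naming is invented.

```python
# Greedy sharing of the most frequent XOR input pair across equations.
from collections import Counter
from itertools import combinations

def extract_once(exprs):
    """exprs: list of frozensets of input bits XOR-ed together.
    Factor the most common input pair out into a new shared signal."""
    pairs = Counter()
    for e in exprs:
        for pair in combinations(sorted(e), 2):
            pairs[pair] += 1
    (a, b), count = pairs.most_common(1)[0]
    if count < 2:
        return exprs, None                   # nothing worth sharing
    t = f"t_{a}_{b}"                         # new shared XOR gate a^b
    new = [frozenset(e - {a, b} | {t}) if {a, b} <= e else e for e in exprs]
    return new, (t, a, b)

eqs = [frozenset("abc"), frozenset("abd"), frozenset("abcd")]
eqs, shared = extract_once(eqs)   # shares a^b across all three outputs
print(shared, eqs)
```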
-
Ryuta NARA, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3229-3237
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A scan chain is one of the most important testing techniques, but it can be exploited for side-channel attacks against a cryptography LSI. We focus on scan-based attacks, in which scan chains are targeted as a side channel. Conventional scan-based attacks only consider a scan chain composed solely of the registers in a cryptography circuit. However, a cryptography LSI usually includes many circuits, such as memories, microprocessors, and others. This means that the conventional attacks cannot be applied to a practical scan chain composed of various types of registers. In this paper, a scan-based attack that makes it possible to decipher the secret key in an AES cryptography LSI composed of an AES circuit and other circuits is proposed. By focusing on the bit pattern of a specific register and monitoring its change, our scan-based attack eliminates the influence of registers included in circuits other than AES. Our attack does not depend on the scan chain architecture, and it can decipher practical AES cryptography LSIs.
-
Nobuaki TOJO, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3238-3247
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Recently, a two-level cache, consisting of an L1 cache and an L2 cache, has become common in processors. Particularly in an embedded system in which a single application or a class of applications is repeatedly executed on a processor, the cache configuration can be customized so that an optimal one is achieved. An optimal two-level cache configuration that minimizes overall memory access time or memory energy consumption can be obtained by varying three cache parameters, the number of sets, the line size, and the associativity, for both the L1 cache and the L2 cache. In this paper, we first extend an L1 cache simulation algorithm so that we can explore two-level cache configurations. Second, we propose two two-level cache design space exploration algorithms, CRCB-T1 and CRCB-T2, each of which is based on applying the Cache Inclusion Property to two-level cache configuration. Each of the proposed algorithms realizes exact cache simulation but decreases the number of cache hit/miss judgments by a factor of several thousand. Experimental results show that, by using our approach, the number of cache hit/miss judgments required to optimize a cache configuration is reduced to 1/50-1/5500 of that of the exhaustive approach. As a result, our proposed approach runs an average of 1398.25 times faster than the exhaustive approach, achieving, to the authors' knowledge, the world's fastest two-level cache design space exploration.
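To make the per-configuration simulation cost concrete, here is a minimal two-level hit/miss counter (LRU replacement; write policies omitted). All parameters are illustrative; this is the brute-force style of evaluation whose repetition the CRCB algorithms avoid.

```python
# Minimal set-associative cache simulator with LRU replacement.
class Cache:
    def __init__(self, n_sets, ways, line_size):
        self.n_sets, self.ways, self.line = n_sets, ways, line_size
        self.sets = [[] for _ in range(n_sets)]   # per-set LRU list of tags
        self.hits = self.misses = 0

    def access(self, addr):
        block = addr // self.line
        idx, tag = block % self.n_sets, block // self.n_sets
        s = self.sets[idx]
        if tag in s:
            s.remove(tag)
            s.append(tag)                         # move to MRU position
            self.hits += 1
            return True
        self.misses += 1
        if len(s) == self.ways:
            s.pop(0)                              # evict the LRU tag
        s.append(tag)
        return False

l1, l2 = Cache(64, 2, 32), Cache(256, 8, 64)
for a in [0, 32, 0, 4096, 0]:
    if not l1.access(a):
        l2.access(a)                              # L2 sees only L1 misses
print(l1.hits, l1.misses, l2.hits, l2.misses)
```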
-
Sumek WISAYATAKSIN, Dongju LI, Tsuyoshi ISSHIKI, Hiroaki KUNIEDA
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3248-3257
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
An entropy decoding engine plays an important role in modern multimedia decoders. Previous research focused on decoding performance paid considerable attention to only one parameter, such as the data parsing speed, and did not consider the performance impact of the table configuration time and memory size. In this paper, we develop a novel entropy decoding method based on a two-step group matching scheme. Our approach achieves high performance in both data parsing speed and configuration time with a small memory requirement. We also deploy our decoding scheme to implement an entropy decoding processor, which performs operations based on normal processor instructions and VLD instructions for decoding variable-length codes. Several extended VLD instructions are provided to speed up the bitstream parsing process in modern multimedia applications. This processor provides a solution with software flexibility and hardware speed for stand-alone entropy decoding engines. The VLSI hardware is designed with the Language for Instruction Set Architecture (LISA), with 23Kgates and a 110MHz maximum clock frequency under TSMC 0.18µm technology. Experimental simulations revealed that the proposed processor achieves high performance and is suitable for many practical applications such as MPEG-2, MPEG-4, H.264/AVC and AAC.
-
Takuji HIEDA, Hiroaki TANAKA, Keishi SAKANUSHI, Yoshinori TAKEUCHI, Ma ...
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3258-3267
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Partial forwarding is a design method that places forwarding paths on part of a processor pipeline. The hardware cost of a processor can be reduced without performance loss by partial forwarding. However, a compiler with an instruction scheduler that considers the partial forwarding structure of the target processor is required, since conventional scheduling algorithms cannot make the most of a partial forwarding structure. In this paper, we propose a heuristic instruction scheduling method for processors with a partial forwarding structure. The proposed algorithm uses available distance to schedule instructions suitably for the target partial forwarding processor. Experimental results show that the proposed method generates near-optimal solutions in practical time, and some of the codes optimized for the partial forwarding processor run in the shortest time among the target processors. They also show that the proposed method is superior to a hazard detection unit.
-
Taiga TAKATA, Yusuke MATSUNAGA
Article type: PAPER
Subject area: Embedded, Real-Time and Reconfigurable Systems
2009 Volume E92.A Issue 12 Pages
3268-3275
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Recent technology mappers for LUT-based FPGAs employ cut enumeration. Although many cuts are often needed to find a good network, enumerating all cuts of large size consumes a lot of run-time. Existing algorithms employ bottom-up merging, which calculates Cartesian products of the fanins' cuts for each node. The number of cuts is much smaller than the size of the Cartesian products in most cases; thus, the existing algorithms are inefficient. Furthermore, the number of cuts increases exponentially with cut size, which makes the run-time much longer. Several algorithms that enumerate not all cuts but partial cuts have been presented [8],[9], but they tend to degrade the quality of networks. This paper presents two algorithms to enumerate cuts: an exhaustive enumeration and a partial enumeration. Both are efficient because they do not employ bottom-up merging. The partial enumeration reduces the number of enumerated cuts with a guarantee that a depth-minimum network can be constructed. The experimental results show that the exhaustive enumeration runs about 5 and 13 times faster than the existing bottom-up algorithm [12] for K=8 and 9, respectively, while producing the same results. The partial enumeration runs about 9 and 29 times faster than the existing algorithm for K=8 and 9, respectively. The average area of networks derived from the sets of cuts enumerated by the partial enumeration is only 4% larger than that derived using all cuts, and the depth is the same.
-
Mitsuho YAMADA
2009 Volume E92.A Issue 12 Pages
3276
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
-
Shin'ya NISHIDA
Article type: INVITED PAPER
2009 Volume E92.A Issue 12 Pages
3277-3283
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Objective assessment of image and video quality should be based on a correct understanding of subjective assessment by human observers. Previous models have incorporated the mechanisms of early visual processing in image quality metrics, enabling us to evaluate the visibility of errors from the original images. However, to understand how human observers perceive image quality, one should also consider higher stages of visual processing where perception is established. In higher stages, the visual system presumably represents a visual scene as a collection of meaningful components such as objects and events. Our recent psychophysical studies suggest two principles related to this level of processing. First, the human visual system integrates shape and color signals along perceived motion trajectories in order to improve visibility of the shape and color of moving objects. Second, the human visual system estimates surface reflectance properties like glossiness using simple image statistics rather than by inverse computation of image formation optics. Although the underlying neural mechanisms are still under investigation, these computational principles are potentially useful for the development of effective image processing technologies and for quality assessment. Ideally, if a model can specify how a given image is transformed into high-level scene representations in the human brain, it would predict many aspects of subjective image quality, including fidelity and naturalness.
-
Toru YAMADA, Yoshihiro MIYAMOTO, Yuzo SENDA, Masahiro SERIZAWA
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3284-3290
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a reduced-reference video-quality estimation method suitable for individual end-user quality monitoring of IPTV services. With the proposed method, the activity values for individual given-size pixel blocks of an original video are transmitted to end-user terminals. At the end-user terminals, the video quality of a received video is estimated on the basis of the activity difference between the original video and the received video. Psychovisual weightings and video-quality score adjustments for fatal degradations are applied to improve estimation accuracy. In addition, low-bit-rate transmission is achieved by using temporal sub-sampling and by transmitting only the lower six bits of each activity value. The proposed method achieves accurate video quality estimation using only low-bit-rate original video information (15kbps for SDTV). The correlation coefficient between actual subjective video quality and estimated quality is 0.901 with 15kbps side information. The proposed method does not need computationally demanding spatial and gain-and-offset registrations. Therefore, it is suitable for real-time video-quality monitoring in IPTV services.
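A minimal sketch of the block-activity feature follows, assuming per-block pixel variance as the activity measure; the paper's exact activity definition, psychovisual weighting, and score mapping are not reproduced.

```python
# Per-block activity and the reduced-reference activity difference.
import numpy as np

def block_activity(frame, bs=16):
    """frame: 2-D grayscale array; returns one activity value per block."""
    h, w = frame.shape
    h, w = h - h % bs, w - w % bs
    blocks = frame[:h, :w].reshape(h // bs, bs, w // bs, bs)
    return blocks.var(axis=(1, 3))     # variance as an activity proxy

def activity_difference(original, received, bs=16):
    return np.abs(block_activity(original, bs)
                  - block_activity(received, bs)).mean()
```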
-
Kenji SUGIYAMA, Naoya SAGARA, Yohei KASHIMURA
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3291-3296
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
With DCT coding, block artifacts and mosquito noise appear in decoded pictures. The control of post filtering is important for reducing these degradations without causing side effects. Decoding information is useful if the filter is inside or close to the encoder; however, such control is difficult with independent post filtering, such as in a display. In this case, control requires estimating the artifact from only the decoded picture. In this work, we describe an estimation method that determines the mosquito-noise block and level. In this method, the ratio of spatial activity is taken between the mosquito block and the neighboring flat block. We test the proposed method using reconstructed pictures coded with different quantization scales, and find that the results are mostly reasonable across the different quantizations.
-
Kazuhisa YAMAGISHI, Takanori HAYASHI
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3297-3306
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A non-intrusive packet-layer model is required to passively monitor the quality of experience (QoE) during service. We propose a packet-layer model that can be used to estimate the video quality of IPTV using quality parameters derived from transmitted packet headers. The computational load of the model is lighter than that of models that take video signals and/or video-related bitstream information, such as motion vectors, as input. The model is applicable even if the transmitted bitstream information is encrypted, because it uses transmitted packet headers rather than bitstream information. To develop the model, we conducted three extensive subjective quality assessments for different encoders and decoders (codecs) and different video content. We then modeled the subjective video quality assessment characteristics based on objective features affected by coding and packet loss. Finally, we verified the model's validity by applying it to unknown data sets different from the training data sets.
-
Amal PUNCHIHEWA, Jonathan ARMSTRONG, Seiichiro HANGAI, Takayuki HAMAMO ...
Article type: PAPER
Subject area: Evaluation
2009 Volume E92.A Issue 12 Pages
3307-3312
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper presents a novel approach to analysing the colour bleeding caused by image compression, achieved by isolating two components of colour bleeding and evaluating them separately. Although these specific components of colour bleeding have not been studied in great detail in the past, with the use of a synthetic test pattern, similar to the colour bars used to test analogue television transmissions, we have successfully isolated and evaluated "colour blur" and "colour ringing" as two separate components of the colour bleeding artefact. We have also developed metrics for these artefacts, and tested these derived metrics in a series of trials aimed at testing the colour reproduction performance of a JPEG codec and a JPEG2000 codec, both implemented by the developer IrfanView. The algorithms developed to measure these artefact metrics proved to be effective tools for evaluating and benchmarking the performance of similar codecs, or different implementations of the same codecs.
-
Rachel Mabanag CHONG, Toshihisa TANAKA
Article type: PAPER
Subject area: Imaging
2009 Volume E92.A Issue 12 Pages
3313-3320
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
A new algorithm for simultaneously detecting and identifying invariant blurs is proposed. It is mainly based on the behavior of extrema values in an image. It is computationally simple and fast, making it suitable for preprocessing, especially in practical imaging applications. Benefits of employing this method include the elimination of unnecessary processing, since unblurred images are separated from the blurred ones, which require deconvolution. Additionally, it can improve reconstruction performance by properly identifying the blur type, so that a more effective blur-specific deconvolution algorithm can be applied. Experimental results on natural images and their synthetically blurred versions show the characteristics and validity of the proposed method. Furthermore, it can be observed that feature selection makes the method more efficient and effective.
-
Hideyasu KUNIBA, Roy S. BERNS
Article type: PAPER
Subject area: Imaging
2009 Volume E92.A Issue 12 Pages
3321-3327
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Image sensor noise was estimated in an approximately perceptually uniform space with a color image sensor model. In particular, the noise level with respect to an image sensor's pixel pitch and dark noise was investigated. It was shown that the noise level could be roughly halved when the spectral sensitivity was optimized considering noise, at the cost of reduced color reproduction accuracy. It was also shown that for a 2.0µm pixel pitch sensor, the exposure index should be less than 100-150 in order to keep the noise level σ94 below 5 even with no dark noise, whereas the exposure index could reach about 2000-4000 for an 8.0µm pixel pitch sensor, depending on the sensor sensitivity and the dark noise level.
-
Masayuki UKISHIMA, Hitomi KANEKO, Toshiya NAKAGUCHI, Norimichi TSUMURA ...
Article type: PAPER
Subject area: Printing
2009 Volume E92.A Issue 12 Pages
3328-3335
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
The image quality of halftone prints is significantly influenced by the optical characteristics of paper. Light scattering in paper produces optical dot gain, which has a significant influence on the tone and color reproduction of halftone prints. The light scattering can be quantified by the Modulation Transfer Function (MTF) of paper. Several methods have been proposed to measure the MTF of paper; however, these methods have problems in measurement efficiency or accuracy. In this article, a new method is proposed to measure the MTF of paper efficiently and accurately, and the dot gain effect on halftone prints is analyzed. The MTF is calculated from the ratio, in the spatial frequency domain, between the responses of incident pencil light to the paper and to a perfect specular reflector. Since the spatial frequency characteristic of the input pencil light can be obtained from the response of the perfect specular reflector, there is no need to produce an input illuminant with an "ideal" impulse characteristic. Our method is experimentally efficient since only two images need to be measured, and it is accurate since the data can be approximated by the conventional MTF model. Next, we predict the reflectance distribution of a halftone print using the MTF measured in microscopy, in order to analyze the dot gain effect, since it can clearly be observed in the halftone microstructure. Finally, a simulation is carried out to remove the light scattering effect from the predicted image. Since the simulated image is not affected by optical dot gain, it can be used to analyze the real dot coverage.
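The frequency-domain ratio at the heart of the measurement can be sketched in a few lines. One-dimensional line-spread profiles are used here for brevity (real measurements are 2-D images), and the normalization choice is an assumption.

```python
# MTF of paper as the spectral ratio of the paper response to the
# perfect-specular-reflector response (1-D sketch).
import numpy as np

def paper_mtf(paper_response, reference_response):
    """Both arguments: line-spread profiles sampled on the same grid."""
    P = np.abs(np.fft.rfft(paper_response))
    R = np.abs(np.fft.rfft(reference_response))
    mtf = P / np.maximum(R, 1e-12)    # guard against division by zero
    return mtf / mtf[0]               # normalize to 1 at DC
```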
-
Zisheng LI, Jun-ichi IMAI, Masahide KANEKO
Article type: PAPER
Subject area: Processing
2009 Volume E92.A Issue 12 Pages
3336-3343
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Active Shape Model (ASM) is a powerful statistical tool for image interpretation, especially in face alignment. In the standard ASM, local appearances are described by intensity profiles, and the model parameter estimation is based on the assumption that the profiles follow a Gaussian distribution. It suffers from variations in pose, illumination, expression, and occlusion. In this paper, an improved ASM framework, GentleBoost-based SIFT-ASM, is proposed. Local appearances of landmarks are instead represented by SIFT (Scale-Invariant Feature Transform) descriptors, which are gradient orientation histograms computed over the image neighborhood. They can provide more robust and accurate guidance for search than grey-level profiles. Moreover, GentleBoost classifiers are applied to model and search the SIFT features, instead of relying on the unnecessary assumption of a Gaussian distribution. Experimental results show that SIFT-ASM significantly outperforms the original ASM in aligning and localizing facial features.
-
Osama AHMED OMER, Toshihisa TANAKA
Article type: PAPER
Subject area: Processing
2009 Volume E92.A Issue 12 Pages
3344-3354
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
This paper addresses problems appearing in restoration algorithms that utilize both Tikhonov and bilateral total variation (BTV) regularization. The former assumes that the prior information has a Gaussian distribution, which fails at edges, while the latter depends highly on the selected bilateral filter parameters. To overcome these problems, we propose a locally adaptive regularization. In the proposed algorithm, we use general directional regularization functions with adaptive weights. The adaptive weights are estimated from local patches based on properties of the partially restored image. Unlike Tikhonov regularization, the method avoids smoothing across edges by using adaptive weights. In addition, unlike BTV regularization, the proposed regularization function does not depend on parameter selection. The convexity conditions as well as the convergence conditions are derived for the proposed algorithm.
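For context, one common form of the BTV regularizer that the abstract contrasts with is shown below (after Farsiu et al.); the paper's adaptive variant replaces the fixed geometric weights with locally estimated ones.

```latex
% Bilateral total variation with shift operators S_x^k, S_y^l (shifts of
% k and l pixels) and a fixed decay weight 0 < \alpha < 1:
R_{\mathrm{BTV}}(\mathbf{x}) = \sum_{k=-P}^{P}\;\sum_{\substack{l=0 \\ k+l \ge 0}}^{P}
    \alpha^{|k|+|l|}\,\bigl\lVert \mathbf{x} - S_x^{k} S_y^{l}\,\mathbf{x} \bigr\rVert_1
```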
-
LeThanh HA, Chun-Su PARK, Seung-Won JUNG, Sung-Jea KO
Article type: PAPER
Subject area: Coding
2009 Volume E92.A Issue 12 Pages
3355-3360
Published: December 01, 2009
Released on J-STAGE: December 01, 2009
JOURNAL
RESTRICTED ACCESS
Context-based Adaptive Binary Arithmetic Coding (CABAC) is adopted as an entropy coding tool in the main profile of the video coding standard H.264/AVC. CABAC achieves a high degree of redundancy reduction by estimating the conditional probability of each binary symbol that is input to the arithmetic coder. This paper presents an entropy coding method based on CABAC. In the proposed method, each binary symbol is coded using a more precisely estimated conditional probability, leading to improved performance. We apply our method to the standard and evaluate its performance for different video sources and various quantization parameters (QP). Experimental results show that our method outperforms the original CABAC in terms of coding efficiency, with average bit-rate savings of up to 1.2%.
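To illustrate the general idea of adaptive binary probability estimation, here is a generic exponential-decay estimator of the kind such coders feed to the arithmetic coding stage. This is an illustration only, not the H.264 LPS/MPS state machine or the paper's refined estimator.

```python
# Generic adaptive estimate of P(bit = 1) for a binary symbol stream.
def probability_trace(bits, alpha=0.05):
    p_one = 0.5                    # uninformed initial estimate
    estimates = []
    for b in bits:
        estimates.append(p_one)    # probability used to code this bit
        p_one += alpha * (b - p_one)   # move estimate toward observed bit
    return estimates

print(probability_trace([1, 1, 0, 1, 1, 1, 0, 1])[-1])
```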