A technique for object orientation detection using a feed-forward neural network
The present invention relates to a technique for detecting the orientation of features, such as printed text, on an object using a network including a decision arrangement in the form of a feed-forward neural network to determine the orientation of the object.
As assembly processes move towards "Just In Time" operation, automatic inspection becomes a more necessary technology. For example, a tight loop between an operation and its inspection can be created, thereby ensuring that when errors occur due to a manufacturing setup, no more than a minimum of defective items are produced. This contrasts with traditional batch manufacturing, where a lot or batch of a product may be made before an error is detected. Rapid detection offers other advantages as well. For example, in surface mounted assembly of circuit packs, components are first placed and held with an adhesive before being soldered. Therefore, an inspection system placed in-line after the placement operation can catch errors before the soldering process is performed, thereby also reducing the repair cost to a minimum.
Various arrangements have been devised for inspecting circuit boards for defects. For example, U. S. patent 4,028,827 issued to B. H. Sharp on June 7, 1977, discloses a video system for selectively ascertaining the presence and absence of, and discriminating between, at least two different types of light reflecting surface areas on articles. The system has use for inspecting circuit paths and solder connections. Another arrangement is disclosed in U.S. patent 4,578,810 issued to J. W. MacFarlane et al. on March 25, 1986 wherein an automatic visual tester detects printed wiring board (PWB) defects. The detector comprises an array of optical sensors that forms a binary image pattern of the PWB for optically inspecting the printed wire circuit.
An article by S. P. Denker et al. in Proceedings of International Test Conference, October 1984 in Philadelphia, Pa. at pages 558-563 discloses an automatic visual tester that detects printed circuit board (PCB) assembly errors using machine vision technology. In this tester, a camera is used to capture an image of the PCB, and transmits an electrical representation of this image to a computer which compares the features of the PCB image with an ideal image stored in memory to detect any assembly errors. An alternative arrangement was disclosed in the article by D. J. Svetkoff et al. in Hybrid Circuits (GB), No. 13, May 1987 at pages 5-8 wherein a technique is disclosed for the automatic inspection of component boards using both a three-dimensional map of a circuit board under test and Gray-Scale vision data. The technique is described as usable for the detection of components such as solder paste volume, and measurements of orientation.
The prior art, although inspecting various locations of elements and whether connections and all components are properly placed on a circuit board, is limited to situations where the non-defective item always appears identical to a stored image. Absent from the prior art is a method to provide a low-cost, reliable and complete inspection system that will determine the orientation of the components themselves in various stages of manufacture, such as the loading of hoppers, component placement on a board before soldering, placement of a chip on a circuit board, etc. Furthermore, a problem with the prior art systems is that these inspection systems require the stored ideal images to match the inspected components, whereas the markings on components may vary considerably, for example, due to the use of date stamps or varying printing positions and styles.
The foregoing deficiency and problem in the prior art have been solved in accordance with the present invention which relates to a method and to an apparatus, as set out in claims 1 and 6 respectively, for determining the orientation of features or markings, such as letters, numerals, trademark symbols, etc., on an object, e.g., an electrical component, to determine the feature orientation, and, in turn, the orientation of the object itself. In the present system, an image of all or part of the object is used to extract lines of symbols and individual symbols. The separated symbols are then each appropriately trimmed and scaled to provide normalized symbols before, for example, an Optimal Bayesian Decision method, in the form of a feed-forward neural network, determines the "right-side up", "upside-down" or "indeterminate" orientation of the text after a predetermined number of symbols are processed. Because the present invention can determine the orientation of features or markings without requiring a stored ideal image, it is insensitive to changes in content, e.g., date stamps and font style, while retaining the needed sensitivity to orientation.
FIG. 1 is a block diagram of a preferred arrangement for an object orientation detection system in accordance with the present invention, which system is able to read information, such as printed text or markings, on a component, e.g., an electrical device or chip, and determine the orientation of the information. Such a system can be used, for example, in electronic assembly lines where sometimes hundreds of components are placed on a circuit pack. In such assembly lines, the components must be, for example, first loaded into their hoppers manually, and the symmetry of the component allows an orientation error to occur, either at setup or replenishment time. While many items have an orientation mark such as a notch, bevel or dot, there is at present no standardization of these marks and, where marks are used, they are often very hard to detect with a computer vision system. Some of the key requirements of a font orientation detection problem are (1) that there is no advance knowledge of the exact nature or content of the features, for example, font style or size; (2) the features are often poorly visible, for example, the printing is often of poor quality; (3) detection must be carried out quickly while the probability of (a) detection of incorrect orientation should be as high as possible, and (b) a "false reject" should be very low; and (4) there are many features, for example, characters that are invariant, or almost invariant, to a 180 degree rotation and would be read correctly in either orientation. Regarding this latter point, it should be noted that of the set of letters A-Z and digits 0-9, about 40% (namely BHIMOQSWXZ01689) are approximately either rotation invariant or rotation conjugate-invariant, i.e., symbols after rotation by 180 degrees appear like other symbols in the set, depending on the font used.
For purposes of illustration, it is assumed hereinafter that the present system will be used to determine the orientation of a chip, but it should be understood that the present system could be adapted for use in determining the orientation of any object having single or multiple features thereon, such as, for example, boxes on a conveyor belt, labels on bottles, etc. Furthermore, the system is equally applicable for use in determining the orientation of features or markings other than text. The system relies only on having as an input a set of reference images which could be arbitrary.
In the orientation detection system of FIG. 1, a video imaging device 10 is provided including, for example, (1) a video camera, (2) a frame grabber to capture the picture signals associated with a frame of the video picture; and (3) analog-to-digital (A/D) hardware to provide, for example, 512-by-512 pixel frames with a real object scale of approximately 0.002 inches per pixel, and 256 gray levels per pixel. A reference design database 11 is used to store and provide an accurate footprint or position and size for each device being viewed, such as a chip relative to a board onto which it is assembled, and the feature orientation for a correctly placed or inserted device. The resultant images of the exemplary chips from the video image device 10 are provided as one input to an Adaptive Threshold Module (ATM) 12.
In ATM 12, the footprint of the chip from design database 11 is used to direct attention to the particular component or area of interest on the image of the chip from video imaging device 10, and such resultant component or area image may now only include, for example, 200-by-130 pixels instead of the 512-by-512 pixel image provided by video imaging device 10. ATM 12 transforms the resultant component or area image into a binary image where each pixel includes only a 0 or 1 level instead of one of the exemplary 256 gray levels. Although it is possible to omit this binarization step and process the image entirely with gray scale data, the ATM binarization step is present in the preferred embodiment, because it provides a much faster and lower cost implementation. In general, automatic evaluation of a suitable threshold between the original 256 gray levels and the two binary levels is a nontrivial problem, but in the present system a simple histogram method will work well because in most cases the text appears as one color on a contrasting background, and the gray level histogram has two fairly easily resolved peaks. These peaks are resolved adaptively using iteration of two parameters, namely the radial circumspection and mass threshold, until a satisfactory solution is obtained. The radial circumspection is the radius of a neighborhood in the histogram which is examined to verify that a value is a local maximum. To constitute a peak, a gray level must be both a local maximum and have a mass exceeding the mass threshold. This technique also allows correction so that the foreground is represented as a "1" and the background as a "0", regardless of whether the printing is light-on-dark or dark-on-light, etc.
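The adaptive thresholding described above can be illustrated with a minimal Python sketch. Several details are assumptions made only for illustration and are not taken from the description: the "mass" of a gray level is taken to be its histogram count, the parameter schedule (widen the radius when more than two peaks are found, halve the mass threshold when fewer are found) is arbitrary, the threshold is placed midway between the two peaks, and the background is assumed to cover more pixels than the printing. The names `find_peaks` and `binarize` are hypothetical.

```python
def find_peaks(hist, radius, mass_threshold):
    """Gray levels that are local maxima within +/-radius (the "radial
    circumspection") and whose count exceeds mass_threshold.
    Assumption: "mass" means the histogram count at that level."""
    peaks = []
    n = len(hist)
    for g in range(n):
        lo, hi = max(0, g - radius), min(n, g + radius + 1)
        if hist[g] > mass_threshold and hist[g] == max(hist[lo:hi]):
            if peaks and g - peaks[-1] <= radius:
                continue  # same plateau as the previous peak
            peaks.append(g)
    return peaks


def binarize(image, levels=256):
    """Map a gray-level image (list of rows) to 0/1 with foreground = 1."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    # Assumed starting values and update schedule for the two parameters.
    radius, mass = 4, 0.02 * sum(hist)
    for _ in range(64):
        peaks = find_peaks(hist, radius, mass)
        if len(peaks) == 2:
            break
        if len(peaks) > 2:
            radius += 2      # too many peaks: examine a wider neighborhood
        else:
            mass *= 0.5      # too few peaks: accept lighter peaks
    else:
        raise ValueError("could not resolve two histogram peaks")
    threshold = (peaks[0] + peaks[1]) // 2
    # Assumption: the background peak is the more populous one.
    dark_background = hist[peaks[0]] >= hist[peaks[1]]
    return [[int((v > threshold) == dark_background) for v in row]
            for row in image]
```

Under these assumptions a light-on-dark image and its photographic negative produce the same bitmap, which mirrors the polarity correction mentioned above.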
The binary image produced by ATM 12 is provided as an input to a Framer Module 13 where the symbols (letters, numbers, etc.) of the image are captured. More particularly, such symbol capture can be performed by, for example, first taking horizontal sums of the bitmap matrix of 0's and 1's of the binary image to extract lines of symbols, and then taking vertical sums within lines to extract the individual symbols. For such purpose, preset quantile thresholds can be used. More particularly, for each scanline i of the binary image, the row sum r[i] of the "1" bits is computed. Thus r[i] contains peaks for rows of text, and valleys, or gaps, between rows of text. These peaks and gaps can be obscured by noise and variation of the text. Therefore, a threshold d can be used to separate the peaks from the gaps where, for example, d = δ·rmax, δ is a predetermined constant, for example 0.07, and rmax is the largest entry in the row sum vector r[]. The process essentially starts at the top scanline and proceeds through the subsequent scanlines looking for the beginning of a peak above a certain threshold and then continues over a peak area looking for a valley below a certain threshold, etc., to separate lines of symbols. Lines or gaps which appear too small are rejected as being attributed to noise. The process is then repeated using the column sums of the binary image to extract individual symbols. It should be noted that the technique of horizontal and vertical histograms can also disclose, and allow immediate correction for, a 90 degree rotation. To filter out isolated dots which may find their way through the process of ATM 12, the Framer Module 13 can be implemented to also ignore areas with a "connected dot mass" less than a predetermined constant. While such a process may bring a risk of losing part of a broken valid symbol, no untoward effect on a recognizer procedure has been found to occur.
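The row-sum and column-sum segmentation above can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: the function names, the `min_len` parameter and the bounding-box output format are assumptions, only runs that are too short are rejected (gaps that are too narrow are not merged), and the "connected dot mass" filter is omitted.

```python
def runs_above(sums, delta=0.07, min_len=2):
    """Return (start, end) pairs, end exclusive, of consecutive entries of
    `sums` that exceed d = delta * max(sums). Runs shorter than min_len
    are discarded as noise (min_len is an assumed parameter)."""
    peak = max(sums, default=0)
    if peak == 0:
        return []
    d = delta * peak
    runs, start = [], None
    for i, s in enumerate(list(sums) + [0]):  # trailing 0 closes a final run
        if s > d and start is None:
            start = i
        elif s <= d and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def frame_symbols(bitmap, delta=0.07, min_len=2):
    """Return (top, bottom, left, right) boxes, one per extracted symbol."""
    boxes = []
    row_sums = [sum(row) for row in bitmap]            # r[i] in the text
    for top, bottom in runs_above(row_sums, delta, min_len):
        band = bitmap[top:bottom]                      # one line of symbols
        col_sums = [sum(col) for col in zip(*band)]
        for left, right in runs_above(col_sums, delta, min_len):
            boxes.append((top, bottom, left, right))
    return boxes
```

For a bitmap holding two symbols on a first text line and one symbol on a second line, `frame_symbols` returns three boxes in reading order.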
It is to be understood that the Framer Module 13 can also be omitted when the symbols are always in predetermined positions, but is included in the preferred embodiment because this module allows conventionally marked electronic components, on which the printing varies considerably in position, to be suitably inspected. The Framer Module 13, therefore, provides a displacement-invariance capability to the overall system.
A Normalization Module (NM) 14 accepts the output from Framer Module 13 and "trims and scales" each extracted symbol. Trimming is performed because the extracted symbols can have accompanying white, or almost white, spaces at the sides, top and/or bottom. Scaling is performed to scale up the symbol image to occupy a predetermined standard size of, for example, 24 rows by 16 columns. In certain cases this normalization process produces distortion to the original image as, for example, thin characters such as the letter "I" will be "fattened" to occupy the 16 columns. However, this is not undesirable as both the input data and the reference vectors described below are transformed in the same manner. The alternative of not fattening an exemplary "I" has the potential disadvantage of permitting a mismatch due to a horizontal misalignment of a reference and sample image. It is to be understood that the Normalization Module 14 can be omitted when symbols are always of a predetermined size, but is provided in the preferred embodiment, since NM 14 provides a scale-invariance capability to the system.
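The trim-and-scale step can be sketched as below. The sketch assumes nearest-neighbor resampling, which the description does not specify, and it trims only rows and columns that are completely empty rather than "almost white" ones. The function names are hypothetical.

```python
def trim(bitmap):
    """Drop all-zero rows and columns surrounding the symbol."""
    rows = [i for i, r in enumerate(bitmap) if any(r)]
    cols = [j for j, c in enumerate(zip(*bitmap)) if any(c)]
    if not rows:
        raise ValueError("bitmap contains no foreground pixels")
    return [list(r[cols[0]:cols[-1] + 1])
            for r in bitmap[rows[0]:rows[-1] + 1]]


def scale(bitmap, out_rows=24, out_cols=16):
    """Nearest-neighbor resampling to out_rows x out_cols (an assumed
    interpolation choice)."""
    in_rows, in_cols = len(bitmap), len(bitmap[0])
    return [[bitmap[i * in_rows // out_rows][j * in_cols // out_cols]
             for j in range(out_cols)]
            for i in range(out_rows)]


def normalize(bitmap, out_rows=24, out_cols=16):
    """Trim, then scale, giving a fixed-size symbol bitmap."""
    return scale(trim(bitmap), out_rows, out_cols)
```

A one-pixel-wide vertical stroke, such as a thin "I", is "fattened" by this procedure into a bitmap that fills all 16 columns, as noted above.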
The signals representing the normalized symbols are then presented to a decision module 15 where a determination is made, using the design database 11 information, as to whether the symbols are disposed in the "up" or "down" orientation, or are "indeterminate" as to their orientation. As will be described hereinafter, a preferred arrangement for decision module 15 uses a preferred Optimal Detection (OD) method, which, in its particular form, can also be termed a "Feedforward" (FF) neural network method. This method computes the likelihood of the symbol being oriented "up" or "down", or being "indeterminate", by computing a "similarity measure", as for example the Hamming distance, between bitmaps of the captured symbol and reference images, using a lookup table. A symbol, such as a letter, will be referred to in its normal orientation as "up" and in the inverted orientation as "down". An object, such as a chip, may have its text oriented correctly in either the "up" or "down" orientation. A correctly oriented chip will be referred to herein as "right-side up" and an incorrectly oriented chip as "upside down".
More particularly, observed images are bitmaps of length N, i.e., vectors in $\Omega = \{0, 1\}^N$. It is assumed that there is a collection of reference images or symbols (i.e., letters, numerals, trademark logos, etc., in various fonts, sizes, etc.) which appear in the "up" orientation as $u_1, u_2, \ldots, u_s$, and in the "down" orientation as $d_1, d_2, \ldots, d_s$. It is also assumed that there is a distortion process which represents noise inherent in the images, noise due to the image capture process, and variation due to the use of font styles, sizes, etc., which are not in the reference set. This distortion process is represented hereinafter by p(x|y), meaning that reference vector y is distorted into observed vector x with a probability p(x|y). Also defined is that p(x|u) or p(x|d) is the probability that the vector x is observed, given that vector x is a distortion of a randomly chosen symbol from a reference set of $u_1, u_2, \ldots, u_s$ or $d_1, d_2, \ldots, d_s$, respectively. Then, taking each reference symbol to be equally likely to be chosen, the probabilities are

$$p(x|u) = \frac{1}{s}\sum_{i=1}^{s} p(x|u_i), \qquad p(x|d) = \frac{1}{s}\sum_{i=1}^{s} p(x|d_i).$$
One formulation of the problem is then to find a partition of $\Omega$ into regions $\Omega_u$, $\Omega_d$, $\Omega_i$ representing decisions "up", "down" or "indeterminate". This problem is formulated as maximizing the probability of correct determination of orientation subject to a limit on the probability of a "false reject". This results in the following practical technique for optimal determination of the orientation of a single symbol.
The technique outputs:
d ("down") if $p(x|d)/p(x|u) \ge \lambda$,
u ("up") if $p(x|d)/p(x|u) \le \lambda^{-1}$,
i ("indeterminate") otherwise.
The parameter λ≥ 1 is then adjusted from analysis or experiment to be as small as possible, but not so small that there are an excessive number of false rejects.
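The single-symbol test can be sketched in Python once a concrete distortion model is chosen. The description does not specify p(x|y); the sketch below assumes, purely for illustration, that each bit is flipped independently with probability `eps`, so that p(x|y) = eps^h (1 - eps)^(N - h) where h is the Hamming distance between x and y, and that each reference symbol is equally likely. The values of `eps` and `lam` are arbitrary. Log-likelihoods are used to avoid floating-point underflow for long bitmaps.

```python
import math


def hamming(x, y):
    """Number of positions at which bit sequences x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def log_likelihood(x, refs, eps=0.1):
    """log p(x | reference set), averaging over equally likely references.
    Assumed noise model: independent bit flips with probability eps."""
    n = len(x)
    logs = []
    for r in refs:
        h = hamming(x, r)
        logs.append(h * math.log(eps) + (n - h) * math.log(1.0 - eps))
    m = max(logs)  # log-sum-exp keeps the sum numerically stable
    return m + math.log(sum(math.exp(v - m) for v in logs)) - math.log(len(refs))


def classify_symbol(x, ups, downs, lam=10.0, eps=0.1):
    """Return 'd', 'u' or 'i' from the ratio p(x|d)/p(x|u) and lam >= 1."""
    log_ratio = log_likelihood(x, downs, eps) - log_likelihood(x, ups, eps)
    if log_ratio >= math.log(lam):
        return "d"
    if log_ratio <= -math.log(lam):
        return "u"
    return "i"
```

For a flattened bitmap, the "down" reference of a symbol is simply its "up" reference reversed, since a 180 degree rotation reverses both row and column order. A symbol equally distant from both reference sets yields a ratio of 1 and is reported as indeterminate.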
Determination of orientation for a chip is based on multiple symbols $x(1), x(2), \ldots, x(L)$. Thus the conditional independence assumption
$$P\{x(1), x(2), \ldots, x(L) \mid u\} = P\{x(1) \mid u\} \cdot P\{x(2) \mid u\} \cdots P\{x(L) \mid u\},$$
$$P\{x(1), x(2), \ldots, x(L) \mid d\} = P\{x(1) \mid d\} \cdot P\{x(2) \mid d\} \cdots P\{x(L) \mid d\}$$
is adopted.
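Then the optimal test becomes to output:
d ("down") if $\prod_{l=1}^{L} p(x(l)|d)/p(x(l)|u) \ge \lambda$,
u ("up") if $\prod_{l=1}^{L} p(x(l)|d)/p(x(l)|u) \le \lambda^{-1}$,
i ("indeterminate") otherwise.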
If the time allocated to inspecting a board is sufficient, the best possible results are obtained by examining every symbol found on the object. However, it is often possible to obtain very high certainty about the orientation of the object before all of the symbols are read. This suggests the use of a Sequential Testing procedure as disclosed in the book "Sequential Analysis" by A. Wald, Dover Publications, 1947, at pages 34-43, wherein symbols are read until the cumulative product of likelihood ratios exceeds some upper bound $\lambda$ or falls below a lower bound $\lambda^{-1}$, at which time a determination of orientation is made. This may offer the potential of a significant speedup of the process, as measured by mean time per chip. Nevertheless, in cases where bias or poor knowledge of the prior distributions exists, it may still be desirable to use a slowly growing $\lambda$ function, or to limit the contribution to the product of any one observation.
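A minimal Python sketch of such a sequential test follows. It works with the sum of per-symbol log likelihood ratios, which is equivalent to the cumulative product, and stops as soon as either bound is crossed. The function name, the default value of `lam` and the optional `clip` parameter (one possible way of limiting the contribution of any one observation) are assumptions for illustration; a slowly growing bound is not shown.

```python
import math


def sequential_orientation(log_ratios, lam=1000.0, clip=None):
    """Read per-symbol values log(p(x|d)/p(x|u)) one at a time.

    Returns (decision, symbols_read), where decision is 'd', 'u' or 'i'.
    If clip is given, each observation is limited to [-clip, +clip]."""
    bound = math.log(lam)
    total, n = 0.0, 0
    for n, lr in enumerate(log_ratios, start=1):
        if clip is not None:
            lr = max(-clip, min(clip, lr))
        total += lr
        if total >= bound:
            return "d", n       # product of ratios reached lam
        if total <= -bound:
            return "u", n       # product of ratios fell to 1/lam
    return "i", n               # symbols exhausted without a decision
```

With `lam=1000.0`, four symbols that each favor "up" by a log ratio of -3 lead to a decision after the third symbol, so the fourth need not be read.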
An exemplary arrangement for implementing the feed-forward neural network method in order for decision module 15 to carry out the above described process is shown in FIG. 2. In FIG. 2, the inputs from Normalization Module 14 comprise separate elements of the bitmap (matrix) for a normalized symbol, designated $x_1$ to $x_N$. Each of the bitmap elements $x_i$ is provided as an input to a separate Input Unit (IU) $20_1$ to $20_N$. More particularly, if each normalized symbol is arranged to be disposed within an exemplary bitmap matrix of 24-by-16 elements, then N would equal 384 elements and decision module 15 would include 384 Input Units 20. The output from each Input Unit $20_i$ is distributed to each of M Pattern Units $21_1$ to $21_M$, where, for example, a first half of the M pattern units 21 is used to determine the likelihoods $p(x|u_i)$ for the "up" reference images and the second half is used to determine the likelihoods $p(x|d_i)$ for the "down" reference images.
The internal structure of the pattern units can be serial or parallel. The remainder of this paragraph describes a method for efficiently implementing the pattern units 21 when the "similarity measure" is a Hamming distance. In the absence of special purpose hardware for computing Hamming distances, there is a technique that can be employed to compute them very quickly. The Hamming distances between bitmaps can be computed a word at a time by taking the bitwise "exclusive or," and then using a precomputed table lookup which returns the number of one bits in the resulting word. This reduces the Hamming distance calculation to several machine instructions per word, and one can calculate distances in the order of a million words per second on a contemporary microprocessor.
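The word-at-a-time computation can be sketched in Python as follows, using 8-bit words and a 256-entry lookup table. The word size, the packing order and the function names are assumptions chosen for illustration; a wider table (for example 16-bit) follows the same pattern.

```python
# Precomputed table: POPCOUNT[w] is the number of one bits in the byte w.
POPCOUNT = [bin(w).count("1") for w in range(256)]


def pack_bits(bits):
    """Pack a sequence of 0/1 values into bytes, most significant bit first.
    The length must be a multiple of 8 (true for a 24-by-16 bitmap)."""
    if len(bits) % 8:
        raise ValueError("bit count must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | (b & 1)
        out.append(byte)
    return bytes(out)


def hamming_packed(a, b):
    """Hamming distance between two equal-length packed bitmaps:
    XOR each pair of words, then look up the number of one bits."""
    if len(a) != len(b):
        raise ValueError("bitmaps must have the same length")
    return sum(POPCOUNT[x ^ y] for x, y in zip(a, b))
```

A 384-bit normalized symbol packs into 48 bytes, so each comparison against a reference image costs 48 XOR operations and 48 table lookups in this sketch.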
After the operation of elements 20 through 22 of FIG. 2 is completed, an output probability value is obtained for each symbol. These values are combined in the FF method using multiplication in the Sequential Decision Unit 23. The result can then be compared against an upper and a lower threshold in Sequential Decision Unit 23 to generate an overall decision for the chip, or an indecision. This procedure can proceed according to a "stopping" rule whereby the processing of further symbols on the chip is discontinued when a prescribed degree of certainty has been obtained. Finally, the result for the chip is compared in Comparator 24 against the correct orientation for the chip, which orientation is stored in design database 11, and depending upon the setting, e.g., "stop on errors", or "stop on errors plus indecisions", an output signal is provided to, for example, operate an alarm or not.
It is to be understood that the above description of the Feed-Forward Neural Network method was for purposes of explanation, and not for purposes of limitation since any other suitable method could be used to provide the appropriate decision. For example, it is possible to apply the Learning Vector Quantization (LVQ) method similar to that described by T. Kohonen in the book "Self-Organization and Associative Memory", Second Edition, Springer-Verlag at pages 199-209. For the LVQ method, the result obtained in Decision Module 15 would merely be a "vote", i.e., the number of "up" scores divided by the total number of observations, and not an explicit probability value as found in the FF method.