No. Patent title Application No. Filing date Publication No. Publication date Inventor(s)
221 Method for recognizing continuously spoken words EP88200598.6 1988-03-30 EP0285222A2 1988-10-05 Ney, Hermann, Dr.; Noll, Andreas

During recognition, speech values derived from samples of the speech signal are compared with reference values, each word of a given vocabulary being represented by a sequence of reference values. The words are built from phonemes according to a given pronunciation lexicon, and the reference values for the phonemes are determined in a training phase, each phoneme within a word consisting of a number of identical reference values that is determined in the training phase. To approximate transitions between phonemes, each phoneme may also consist of three sections, each with constant reference values. The predetermined number of reference values per phoneme allows the duration of a phoneme in a particular word to be modelled more accurately. Various options are given for determining the reference values and the distance value during recognition.
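
As a minimal sketch of the idea in this abstract, a word's reference sequence can be expanded from a pronunciation lexicon in which every phoneme carries a repeat count. The names `PHONEME_REFS`, `LEXICON` and `word_reference`, and all numeric values, are hypothetical illustrations and do not come from the patent.

```python
# Hypothetical data: none of these values come from the patent.
# One reference vector per phoneme, as it might result from a training phase.
PHONEME_REFS = {"a": [1.0, 0.2], "n": [0.1, 0.9]}
# Pronunciation lexicon: word -> list of (phoneme, repeat count) pairs.
LEXICON = {"an": [("a", 3), ("n", 2)]}


def word_reference(word):
    """Expand a word into its sequence of reference vectors."""
    sequence = []
    for phoneme, count in LEXICON[word]:
        # Each phoneme contributes `count` identical reference vectors,
        # which models its duration inside this particular word.
        sequence.extend([PHONEME_REFS[phoneme]] * count)
    return sequence
```

The repeat count is stored per word and phoneme, so the same phoneme can last longer in one word than in another.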

222 Pattern matching system EP84112926.5 1984-10-26 EP0144689B1 1988-08-24 Watari, Masao
This pattern matching system comprises a pattern matching control unit (3) for gathering an L number of time points i of the input pattern into a block, a distance computation unit (1) with an input pattern buffer (11) having its address specified by the signal of said control unit, and a reference pattern buffer (12) for temporarily storing the vector b^n_j at the j-th frame of the reference pattern so that the input pattern a_i and the reference pattern b^n_j may be read out from the input pattern buffer and the reference pattern buffer to derive the inter-vector distance d(a_i, b^n_j). An asymptotic equation computation unit includes a difference memory. The asymptotic equation of dynamic programming is computed for the time point of each reference pattern and for each time point in one block of the input pattern with reference to the distance d(a_i, b^n_j) under the connecting condition of the dissimilarity Gn(j) at the time point (ii-1)·L for the block number ii of the time point of the input pattern, thereby to derive the dissimilarity Gn(j) at the time point ii·L of the input pattern.
223 VOICE RECOGNITION EP87904962.5 1987-07-30 EP0275327A1 1988-07-27 FUJIMOTO, Junichiroh; YASUDA, Seigou; NAKATANI, Tomofumi

Voice signals are picked up through a microphone (31) and voice sections are detected by a voice section detector (32). The voice signals are then processed at every predetermined time interval by a frequency analyzer (33) having a predetermined number of channels, and the voice signal portions corresponding to the respective channels are quantized. The quantized data obtained from these channels are converted into binary data (34) to form frames, each composed of a series of binary data. Each datum of the frame corresponds to a channel, and the frame length is preferably set to an integer multiple of the operation unit (e.g., 4 bits, 8 bits, etc.) of a computer. When a synthetic frame is to be formed by superposing two or more such frames, the synthetic frame is divided into hierarchies so that each bit may be displayed by binary notation in each hierarchy. It is further possible to divide a frame into a plurality of subframes in order to effect preparatory comparison using the subframes.

224 FRAME COMPARISON METHOD FOR WORD RECOGNITION IN HIGH NOISE ENVIRONMENTS EP87900768.0 1986-12-29 EP0255529A1 1988-02-10 GERSON, Ira, Alan; LINDSLEY, Brett, Louis
A method and arrangement are applied in a speech recognition system that uses information derived from channel banks to represent speech. The method comprises determining three energy levels for each channel: the first represents the background noise energy (20), the second the energy of the input frames (16), and the third the energy of the word template frames (18). Values representing energy level differences are assigned to each channel. If the second energy level is lower than the first, a predetermined constant value is assigned to that particular channel. These values are combined to generate a distance measure describing the similarity between the two frames.
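
The per-channel rule in this abstract might be sketched as follows. The abstract does not say how the energy differences are formed or combined, so the absolute difference, the plain sum, and the constant `masked_value` are assumptions made only for illustration.

```python
def frame_distance(input_energy, template_energy, noise_energy, masked_value=1.0):
    """Distance between an input frame and a word template frame.

    The three arguments are per-channel energy lists of equal length.
    `masked_value`, the absolute difference and the plain sum are
    hypothetical choices, not taken from the patent.
    """
    total = 0.0
    for x, t, n in zip(input_energy, template_energy, noise_energy):
        if x < n:
            # Input energy lies below the background noise estimate:
            # assign the predetermined constant instead of a difference.
            total += masked_value
        else:
            total += abs(x - t)
    return total
```

With `frame_distance([5.0, 1.0], [4.0, 9.0], [2.0, 2.0])`, the second channel is below the noise level, so its large mismatch with the template is replaced by the constant.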
225 Pattern matching device EP82101757.1 1982-03-05 EP0059959B1 1987-11-19 Tsuruta, Shichiro; Sakoe, Hiroaki
The pattern matching device carries out pattern matching between two information-compressed patterns, applying a dynamic programming technique to calculate a similarity measure between the patterns. In matching the two patterns, a weighted similarity measure calculator (64) calculates a weighted similarity measure by multiplying an intervector similarity measure between one feature vector of each of the respective patterns by a weighting factor calculated by the use of a variable interval between each feature vector and the previous one. A recurrence formula is calculated by use of such weighted similarity measures instead of the intervector similarity measures. A predetermined value delta may be used in reducing the number of signal bits used for the recurrence formula. Preferably, a sum for the recurrence formula is restricted by two preselected values. Most preferably, an additional similarity measure is used for the recurrence formula.
226 VOICE RECOGNITION PROCESS UTILIZING CONTENT ADDRESSABLE MEMORY EP86906682.0 1986-10-30 EP0243475A1 1987-11-04 SCHAIRE, Scott
The drawback of prior-art speech recognition systems is that the speech to be recognized must be compared with every entry in the vocabulary to find the closest match, which makes the system slow. In the present system, as each word is spoken during an initialization or training phase, it is digitized (12) and time-compressed (14) to form a key value (16) that corresponds directly to a memory address (18) storing a pointer to the location of the word in a vocabulary memory (20). In a later speech recognition phase, any spoken word is converted into a key value and the word is read from the indirectly addressed vocabulary memory (20) without any need for comparisons with the data stored at every memory location.
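
A rough sketch of this lookup scheme, using a Python dictionary in place of the address memory. The key function (segment means quantized to a few levels) and the names `key_value`, `train` and `recognize` are assumptions for illustration; the abstract does not describe how the key value is actually computed.

```python
address_table = {}  # key value -> index into the vocabulary memory
vocabulary = []     # vocabulary memory


def key_value(samples, key_length=4, levels=4):
    """Time-compress digitized samples (values in [0, 1)) into a fixed-length key.

    Hypothetical scheme; requires len(samples) >= key_length.
    """
    key = []
    for k in range(key_length):
        start = k * len(samples) // key_length
        end = (k + 1) * len(samples) // key_length
        segment = samples[start:end]
        mean = sum(segment) / len(segment)
        key.append(min(int(mean * levels), levels - 1))  # coarse quantization
    return tuple(key)


def train(word, samples):
    """Store the word and record where it lives under its key value."""
    address_table[key_value(samples)] = len(vocabulary)
    vocabulary.append(word)


def recognize(samples):
    """Look the word up directly by key value, with no vocabulary scan."""
    index = address_table.get(key_value(samples))
    return None if index is None else vocabulary[index]
```

Recognition cost does not grow with vocabulary size, at the price that two utterances of the same word must quantize to exactly the same key.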
227 Speech recognition system EP87302604.1 1987-03-25 EP0241183A1 1987-10-14 Kaneko, Toyohisa; Watanuki, Osaaki

The present invention relates to a speech recognition system of the type which comprises storage means (10, 11) for storing selected parameters for each of a plurality of words in a vocabulary to be used for recognition of an input item of speech, comparison means (42) for comparing parameters of each unknown word in an input item of speech with the stored parameters, and indication means (12, 46) responsive to the result of the comparison operation for indicating which of the plurality of vocabulary words most closely resembles each unknown input word.

According to the invention the speech recognition system is characterised in that the stored parameters comprise for each vocabulary word a set of labels each representing a feature of the vocabulary word occurring at a respective segmentation point in the vocabulary word and the probability of the feature associated with each label occurring at a segmentation point in a word. Further, the comparison means compares the stored sets of parameters with a set of labels for each unknown input word each representing a feature of the unknown input word occurring at a respective segmentation point in the unknown input word.

228 Recognition system EP86305495.3 1986-07-16 EP0214728A1 1987-03-18 Stentiford, Frederick Warwick Michael

A waveform recognition system comprising a plurality of detectors of features comprising the combined presence at a plurality of instants spaced at predetermined intervals relative to each other in time of instantaneous amplitudes each satisfying respectively predetermined constraints; means for assigning a plurality of labels and corresponding confidence measures to each of successive portions of said waveform in dependence on the features detected in said portions and storing each label in a buffer corresponding to the rank of the confidence with which the label is assigned relative to other labels assigned to the same portion of data and means for outputting labels from that buffer containing labels assigned with the highest confidence whose confidence measures are in a predetermined relationship with those of adjacent labels in the same buffer when the confidence measures of labels in other buffers containing labels assigned with confidence measures of lower rank satisfy predetermined conditions.

229 Recognition of speech or speech-like sounds EP82304782.4 1982-09-10 EP0074822B1 1987-03-18 Hakaridani, Mitsuhiro; Iwahashi, Hiroyuki; Nishioka, Yoshiki
An improved method of speech recognition suitable for use in simple speech recognition systems is disclosed. The method uses short-time self-correlation functions as feature parameters for recognizing speech or speech-like words, and in particular performs a preliminary selection using part of the data for final recognition, namely the short-time self-correlation functions of lower degrees (typically first to third). In carrying out recognition of speech or speech-like sounds, the method involves creating self-correlation functions for input sound signals, deciding the intervals of the sound signals, normalizing the time axes in conjunction with the sound intervals, and recognizing words or the like by deciding, with the self-correlation functions as feature parameters, whether there is a match with reference patterns. In the above method, preliminary selection is effected prior to the final recognition step by means of linear matching using the self-correlation functions of lower degrees.
230 Speech recognition system EP82302232.2 1982-04-29 EP0065829B1 1986-08-20 Hitchcock, Myron H.
A microcomputer is used for speaker-independent speech recognition with a carefully selected vocabulary, and may be manufactured at extremely low cost for specialised applications. A circuit (13) generates an AC electrical signal having a frequency determined by the speech input. A detector (33) produces digital signals by comparing the AC electrical signal with a threshold electrical signal level. A counting circuit (31) is connected to the detector for counting the digital signals within time intervals defined by a clock circuit (35) to generate digital count signals. The digital count signals are analysed to produce a speech identifying digital signal, and the speech identifying digital signal is compared with a plurality of speech template digital signals stored in a memory (19) to determine if the speech identifying digital signal corresponds to a matching one of the plurality of speech templates. An output signal generator (23) identifies the matching speech template.
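
The counting stage described above could be approximated in software as follows. This is only a sketch: the function names, the use of upward threshold crossings, and the nearest-template rule based on a sum of absolute differences are assumptions, since the abstract does not specify how the counts are analysed or compared.

```python
def count_signature(samples, threshold, interval):
    """Count upward threshold crossings in consecutive windows of `interval` samples."""
    counts = []
    for start in range(0, len(samples), interval):
        window = samples[start:start + interval]
        crossings = sum(
            1 for prev, cur in zip(window, window[1:])
            if prev < threshold <= cur
        )
        counts.append(crossings)
    return counts


def best_template(counts, templates):
    """Return the name of the template closest to `counts`.

    Closeness is the sum of absolute count differences (a hypothetical choice).
    """
    return min(
        templates,
        key=lambda name: sum(abs(a - b) for a, b in zip(counts, templates[name])),
    )
```

A higher count in a window corresponds to a higher dominant frequency in that stretch of the signal.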
231 Method of and device for the recognition, without previous training of connected words belonging to small vocabularies EP85110986.8 1985-08-30 EP0173986A2 1986-03-12 Colombo, Maura; Pirani, Giancorlo

The method consists in classifying the sounds forming the uttered words into eight phonetic classes, plus a possible indication of the presence of diphthongs, starting from an acoustic-phonetic analysis of the sounds themselves.

To recognize the uttered words, the sequences of classes found are analyzed by search-tree pattern matching algorithms against the sequences of classes corresponding to vocabulary words, and possibly by dynamic programming algorithms.

The detected classes are: silence, voiced fricatives, unvoiced fricatives, plosives, affricates, nasals, semivowels, vowels.

A device for implementing the method is also described.

232 Methods of and apparatus for speech recognition EP85303666.3 1985-05-23 EP0164945A1 1985-12-18 Watari, Masao c/o Sony Corporation; Sako, Yoichiro c/o Sony Corporation; Akabane, Makoto c/o Sony Corporation; Hiraiwa, Atsunobu c/o Sony Corporation

A speech recognition method and apparatus in which a trajectory drawn by a plurality of time-sequential acoustic parameters obtained from segmented speech signals in the parameter space thereof time-normalizes the segmented speech signals (2, 17) and this trajectory is matched (25, 6) with a standard time-normalized trajectory, which was previously registered (4), to perform the speech recognition. A silence acoustic parameter (30) can also be added to the time-sequential acoustic parameter produced from the segmented speech signals to obtain the trajectory thereof. The length of the trajectory plotted from the segmented speech signals is determined and is used to select the previously registered standard trajectory (4) to be matched. Therefore, in addition to checking the distance between the segmented speech trajectory and the registered standard trajectory to obtain a match (25, 6), the lengths of the two trajectories are also taken into account, thereby increasing the recognition ratio.
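
One way to picture trajectory-based time normalization is to resample the trajectory at points equally spaced along its length, so that speaking rate no longer matters. The sketch below assumes this equal arc length resampling with linear interpolation; the abstract does not confirm that this is the procedure used, and the function names are hypothetical. It needs Python 3.8 or later for `math.dist`.

```python
import math


def trajectory_length(points):
    """Total length of the polyline through the parameter vectors."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def resample(points, n):
    """Return n points equally spaced along the trajectory.

    Assumes at least two input points and n >= 2.
    """
    total = trajectory_length(points)
    result = []
    seg = 0        # index of the current segment (points[seg] to points[seg + 1])
    walked = 0.0   # arc length at the start of the current segment
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while (seg < len(points) - 2
               and walked + math.dist(points[seg], points[seg + 1]) < target):
            walked += math.dist(points[seg], points[seg + 1])
            seg += 1
        p, q = points[seg], points[seg + 1]
        seg_len = math.dist(p, q)
        frac = 0.0 if seg_len == 0 else (target - walked) / seg_len
        # Linear interpolation between the two ends of the segment.
        result.append(tuple(a + frac * (b - a) for a, b in zip(p, q)))
    return result
```

The value returned by `trajectory_length` could also serve as the selection criterion mentioned in the abstract, for example by comparing only against registered trajectories of similar length.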

233 Pattern matching method and apparatus therefor EP85104177.2 1985-04-04 EP0162255A1 1985-11-27 Watari, Masao c/o NEC Corporation

The pattern matching method and apparatus described herein comprise a step of determining a distance d(m,j) between each feature vector b_j of a reference pattern and a feature vector a_m of an input pattern at each point (m,j) of the i-th block, which is defined in the form of a parallelogram that has the IL frame width of an input pattern and is inclined with respect to said reference time axis; a step of conducting DP matching calculations on the basis of the distance determined; and a step of setting the boundary value of an (i+1)-th slant (oblique) block to the values obtained by said DP matching on the slant line where it contacts the i-th block, thereby to effect said DP matching calculations. This pattern matching method and apparatus can recognise, with high accuracy and at high speed, an input string of words spoken in compliance with various grammars. Furthermore, the DP matching can be processed at high speed with less transfer of pattern data and with less capacity in the pattern buffer memories.

234 SPEECH RECOGNITION METHODS AND APPARATUS. EP83901536 1983-03-28 EP0139642A4 1985-11-07 BAKER JAMES K; MACALLISTER JEFFREY G; KLOVSTAD JOHN W; SIDELL MARK F; BROWN PETER F; GANESAN KALYAN; HATTON TERENCE J; LEE CHIN-HUI; ROSS STEVEN; ROTH ROBERT S
A speech recognition method and apparatus employ speech processing circuitry (26) for repetitively deriving from a speech input (100), at a frame repetition rate, a plurality of acoustic parameters. The acoustic parameters represent the speech input signal for a frame time. A plurality of template matching and cost processing circuitries (28, 30) are connected to a system bus (24), along with the speech processing circuitry, for determining, or identifying, the speech units in the input speech by comparing the acoustic parameters with stored template patterns. The apparatus can be expanded by adding more template matching and cost processing circuitry to the bus, thereby increasing the speech recognition capacity of the apparatus. The speech processing circuitry establishes overlapping time durations for generating the acoustic parameters and further employs a sinc-Kaiser smoothing function in combination with a folding technique (113) for providing a discrete Fourier transform (112). The Fourier spectra are transformed using a principal component analysis (122) which optimizes the across-class variance. The template matching and cost processing circuitries (28, 30) provide distributed processing, on demand, of the acoustic parameters for generating the recognition decision through a dynamic programming technique. Grammar and word model syntax structures reduce the computational load. Template pattern generation is aided by using a "joker" word to specify the time boundaries of utterances spoken in isolation.
235 Method for evaluating the similarity of two digitally represented number sequences, in particular function curves EP85102096.6 1985-02-26 EP0157165A1 1985-10-09 Luchner, Stefan; Lagger, Helmut, Dr.

A method for evaluating the similarity of two digitally represented number sequences, in particular function curves, in which components or bit groups (AKF) of two binary words (A, B) are related to each other by means of threshold values in order to form distance values (di), from which a similarity measure can be derived. A cascaded arrangement of PROMs (P1 ... P16; P1 ... PIV; P17) and adders (ADD1, ADD2) is provided for carrying out the method.

236 Pattern recognition apparatus and method for making same EP83300434 1983-01-27 EP0085545A3 1985-09-18 Kenichi, Maeda c/o Patent Division; Tsuneo, Nitta c/o Patent Division

A pattern recognition apparatus is provided including a vector generating unit for generating an input vector representing the characteristics of an unknown input pattern, a dictionary unit which stores a plurality of reference vectors for each category, a similarity calculating unit which calculates a similarity between the input vector and a plurality of reference vectors for each category, a comparing unit which determines the category to which the input pattern belongs by comparing the similarities derived from the similarity calculating unit, and an additional dictionary generating unit which generates additional reference vectors for a particular need or application.

The additional dictionary generating unit generates an additional reference vector obtained by subtracting the components of at least common reference vectors, in the specified category, from the input vector. The additional reference vector is stored in the storage area within the additional dictionary memory corresponding to a specified category.

237 APPARATUS AND METHOD FOR ARTICULATORY SPEECH RECOGNITION. EP82902772 1982-08-04 EP0114814A4 1985-06-26 KELLETT HENRY G
The parallel application of a plurality of vocal tract filters (12) is utilized in a speech recognition apparatus. Each of the filters in this bank of filters has a complex Fourier transfer function that is the reciprocal of a particular vocal tract transfer function corresponding to a particular speech sound. The speech elements identified are phoneme segments of short duration (typically 10 milliseconds), and correspond to phonemes only in the case of time-invariant (sustainable) phonemes. In a preferred embodiment, it is assumed that the bank of inverse filters, when designed to correspond with the sustainable phonemes, is also capable of closely matching (in a piecewise fashion) the time-varying (transitional) phonemes. The input of each filter in the bank of filters is connected to the speech waveform input (30). The output of each filter is examined (18) to determine which output has the smallest absolute value. Generally, the filters are designed so that when a filter channel has the smallest output at a given time, there is usually present at the input of the filter bank at that time the waveform of the sound such filter was designed to detect. In a preferred embodiment, the apparatus determines (22) which filter's output is minimum in absolute value for the greatest total time over a given short interval of time, and the associated sound is specified as the sound present at the input.
238 Pattern matching system EP84112926.5 1984-10-26 EP0144689A1 1985-06-19 Watari, Masao

This pattern matching system comprises a pattern matching control unit (3) for gathering an L number of time points i of the input pattern into a block, a distance computation unit (1) with an input pattern buffer (11) having its address specified by the signal of said control unit, and a reference pattern buffer (12) for temporarily storing the vector at the j-th frame of the reference pattern so that the input pattern and the reference pattern may be read out from the input pattern buffer and the reference pattern buffer to derive the inter-vector distance. An asymptotic equation computation unit includes a difference memory. The asymptotic equation of dynamic programming is computed for the time point of each reference pattern and for each time point in one block of the input pattern with reference to the distance under the connecting condition of the dissimilarity Gn(j) at the time point (ii-1)·L for the block number ii of the time point of the input pattern, thereby to derive the dissimilarity Gn(j) at the time point ii·L of the input pattern.

239 Pattern matching apparatus EP84108509.5 1984-07-18 EP0139875A1 1985-05-08 Sakoe, Hiroaki

The pattern matching apparatus comprises a reference pattern supplying means for supplying a reference feature sequence pattern containing control operators for controlling branching and/or omission, an input pattern supplying means for supplying an input pattern of input feature sequence, and a distance computing section for computing the distance between the feature of said input pattern and said reference feature sequence pattern. A work memory has addresses adapted to be appointed in accordance with the time point in said reference feature sequence pattern, and is adapted to store the cumulative distance. A recurrence formula computing section executes a DP matching recurrence formula computation in accordance with a plurality of values read out of the work memory and the distance, giving the resulting cumulative distance. The control operator and the position at which the control operator appears are stored in a stack, and a stack processing section has a stack control section which, when the control operator is detected, conducts the PUSH/POP operation of said stack in accordance with the kind of the detected control operator, thereby to control the DP matching recurrence formula computation which is to be conducted in the recurrence formula computing section. This pattern matching apparatus can deal with various possible deformations of the pattern even with reduced capacity of the reference pattern memory. Furthermore, the apparatus is well suited to the recognition of continuously uttered words in synchronism with the input speech.

240 Pattern matching apparatus EP84111915.9 1984-10-04 EP0138166A1 1985-04-24 Sakoe, Hiroaki

The pattern matching apparatus comprises means (20) for supplying a reference pattern consisting of a time sequence of features in which address information is contained, means (40) for supplying an input pattern consisting of a time sequence of features, a work memory (50) for storing a plurality of cumulative distances, means (60) for computing a DP matching recurrence formula on the basis of a cumulative distance read out from the work memory (50), and control means (10) for supplying an address signal to the work memory (50) so as to control the reading address of the work memory (50) based on the address information. This pattern matching apparatus carries out a DP matching operation in accordance with various types of deformation of patterns. Furthermore, the apparatus is capable of efficient DP matching with a reference pattern expressed in a general automaton form.
