No. Patent Title Application No. Filing Date Publication No. Publication Date Inventors
181 CROSS-LINGUAL INITIALIZATION OF LANGUAGE MODELS EP12721634.9 2012-04-25 EP2702586B1 2018-06-06 NAKAJIMA, Kaisuke; STROPE, Brian
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for initializing language models for automatic speech recognition. In one aspect, a method includes receiving logged speech recognition results from an existing corpus that is specific to a given language and a target context, generating a target corpus by machine-translating the logged speech recognition results from the given language to a different, target language, and estimating a language model that is specific to the different, target language and the same, target context, using the target corpus.
182 TEXT RULE BASED MULTI-ACCENT SPEECH RECOGNITION WITH SINGLE ACOUSTIC MODEL AND AUTOMATIC ACCENT DETECTION EP15824083 2015-07-24 EP3172729A4 2018-04-11 PASHINE RAJAT
Embodiments are disclosed for recognizing speech in a computing system. An example speech recognition method includes receiving metadata at a generation unit that includes a database of accented substrings; generating, via the generation unit, accent-corrected phonetic data for words included in the metadata, the accent-corrected phonetic data representing different pronunciations of those words based on the accented substrings stored in the database; receiving, at a voice recognition engine, extracted speech data derived from utterances input by a user to the speech recognition system; and receiving, at the voice recognition engine, the accent-corrected phonetic data. The method further includes determining one or more terminal IDs identifying recognized utterances in the extracted speech data, generating accent data identifying accents detected in the recognized utterances, generating recognized speech data based on the one or more terminal IDs and the accent data, and outputting the recognized speech data to a speech-controlled device.
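The accent-corrected phonetic data described above can be illustrated with a minimal sketch: generate pronunciation variants of a word by substituting accented substrings from a database. The substring table and substitution rule here are invented stand-ins, not the patented database or generation unit.

```python
# Hypothetical accented-substring database; a real system would hold
# many more entries, keyed to specific accents.
ACCENTED_SUBSTRINGS = {"th": ["t", "d"], "v": ["w"]}

def accent_variants(word):
    """Generate accent-corrected pronunciation variants of a word by
    substituting each accented substring found in it."""
    variants = {word}
    for sub, replacements in ACCENTED_SUBSTRINGS.items():
        for variant in list(variants):
            if sub in variant:
                for repl in replacements:
                    variants.add(variant.replace(sub, repl))
    return sorted(variants)

print(accent_variants("this"))  # ['dis', 'this', 'tis']
```

A recognizer could then match an utterance against any of the variants, and the variant that matched would indicate which accent was detected.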
183 GENERATION OF LANGUAGE UNDERSTANDING SYSTEMS AND METHODS EP15831188.6 2015-12-28 EP3241214A1 2017-11-08 WILLIAMS, Jason Douglas; NIRAULA, Nobal Bikram; DASIGI, Pradeep; LAKSHMIRATAN, Aparna; ZWEIG, Geoffrey G.; KOLOBOV, Andrey; GARCIA JURADO SUAREZ, Carlos; CHICKERING, David Maxwell
Domain-specific language understanding models that may be built, tested and improved quickly and efficiently are provided. Methods, systems and devices are provided that enable a developer to build user intent detection models, language entity extraction models, and language entity resolution models quickly and without specialized machine learning knowledge. These models may be built and implemented via single model systems that enable the models to be built in isolation or in an end-to-end pipeline system that enables the models to be built and improved in a simultaneous manner.
184 VOICE RECOGNITION-BASED DIALING EP14909364.3 2014-12-30 EP3241123A1 2017-11-08 MA, Jianjun; HU, Liping; KREIFELDT, Richard Allen
A voice recognition-based dialing method and a voice recognition-based dialing system are provided. The method includes determining a recognition result based on a user's voice input, at least one acoustic model, and at least one language model, where the at least one acoustic model and the at least one language model are obtained based on information collected in an electronic device. The system is configured to obtain at least one acoustic model and at least one language model based on information collected in an electronic device, and to determine a recognition result based on a user's voice input, the at least one acoustic model, and the at least one language model. The acoustic models and the language models are updated based on the information collected in the electronic device, which may improve voice recognition-based dialing.
185 LANGUAGE CONTEXT SENSITIVE COMMAND SYSTEM AND METHOD EP11740235 2011-01-31 EP2531999A4 2017-03-29 TAYLOR ANDREW E; GANYARD JEFFREY P; HERZOG PAUL M
A system and method implements a command system in a speech recognition context in such a way as to enable a user to speak a voice command in a first spoken language to a computer that is operating an application in a second spoken language configuration. The command system identifies the first spoken language the user is speaking, recognizes the voice command, identifies the second spoken language of a target application, and selects the command action in the second spoken language that correlates to the voice command provided in the first spoken language.
186 METHOD AND APPARATUS FOR TRAINING LANGUAGE MODEL, AND METHOD AND APPARATUS FOR RECOGNIZING LANGUAGE EP15184200.2 2015-09-08 EP3046053A3 2016-12-21 LEE, Hodong; LEE, Hoshik; CHOI, Heeyoul; MIN, Yunhong; YOO, Sang Hyun; LEE, Yeha; LEE, Jihyun; CHOI, YoungSang

A method and apparatus for training a language model, include generating a first training feature vector sequence and a second training feature vector sequence from training data. The method is configured to perform forward estimation of a neural network based on the first training feature vector sequence, and perform backward estimation of the neural network based on the second training feature vector sequence. The method is further configured to train a language model based on a result of the forward estimation and a result of the backward estimation.

187 METHOD AND DEVICE FOR SETTING LANGUAGE TYPE EP14843024 2014-08-26 EP3043567A4 2016-08-17 ZHAN WEI; TU CHENGYI; KONG JIANHUA
187 METHOD AND APPARATUS FOR TRAINING LANGUAGE MODEL, AND METHOD AND APPARATUS FOR RECOGNIZING LANGUAGE EP15184200.2 2015-09-08 EP3046053A2 2016-07-20 LEE, Hodong; LEE, Hoshik; CHOI, Heeyoul; MIN, Yunhong; YOO, Sang Hyun; LEE, Yeha; LEE, Jihyun; CHOI, YoungSang

A method and apparatus for training a language model, include generating a first training feature vector sequence and a second training feature vector sequence from training data. The method is configured to perform forward estimation of a neural network based on the first training feature vector sequence, and perform backward estimation of the neural network based on the second training feature vector sequence. The method is further configured to train a language model based on a result of the forward estimation and a result of the backward estimation.

189 MAINTAINING AUDIO COMMUNICATION IN A CONGESTED COMMUNICATION CHANNEL EP13762650.3 2013-08-29 EP3039803A1 2016-07-06 KARIMI-CHERKANDI, Bizhan; KOUCHRI, Farrokh Mohammadzadeh; ALI, Schah Walli
The invention relates to a communication system and a method of maintaining audio communication in a congested communication channel currently bearing the transmission of speech in audio communication between a sender side and a receiver side, the communication channel having at least one signaling channel and at least one payload channel having a quality of service. During the audio communication the quality of service of the payload channel is monitored. If the quality of service of the payload channel is below a threshold the speech at the respective sender side is converted to text; and transmitted over the retained communication channel to the respective receiver side. The text may be converted back to speech at the receiver side.
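The switch-over described above can be sketched as a per-frame decision: forward audio while the payload channel's quality of service holds up, and fall back to speech-to-text once it drops below a threshold. The threshold value and the callback names below are illustrative assumptions, not the patented interfaces.

```python
QOS_THRESHOLD = 0.6  # illustrative quality-of-service threshold

def transmit(speech_frame, qos, speech_to_text, send_payload, send_text):
    """Send speech over the payload channel while QoS is adequate;
    otherwise convert it to text and send the text over the retained
    channel instead. Returns which mode was used."""
    if qos >= QOS_THRESHOLD:
        send_payload(speech_frame)
        return "audio"
    send_text(speech_to_text(speech_frame))
    return "text"

sent = []
mode = transmit(b"frame", qos=0.3,
                speech_to_text=lambda frame: "hello",  # stand-in converter
                send_payload=sent.append,
                send_text=sent.append)
print(mode, sent)  # text ['hello']
```

At the receiver side the inverse step (text-to-speech on the received text) would restore an audible signal, as the abstract notes.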
190 METHOD AND DEVICES FOR LANGUAGE DETERMINATION FOR VOICE TO TEXT TRANSCRIPTION OF PHONE CALLS EP12822974.7 2012-12-06 EP2929677A1 2015-10-14 RUIZ RODRIGUEZ, Ezequiel
The present invention provides methods, devices and systems for determining a language, among a plurality of languages available for voice-to-text transcription of phone calls between a caller and a recipient, provided by an answering machine system, characterized in that at least two of said available languages are proposed to the caller based on: a phone country code corresponding to said caller; a phone country code corresponding to said recipient; a language comprised in the set of languages available for transcription by said answering machine system; and a language selected automatically on the basis of parameters set by the caller or the recipient. Said caller selects said language by interacting with said answering machine system, and a corresponding voice message is transcribed into text in the selected language for forwarding to said recipient.
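The proposal step described above can be sketched as mapping the caller's and recipient's phone country codes to languages and keeping only those the answering machine can transcribe. The country-code table and two-digit-prefix parsing are simplifying assumptions for illustration.

```python
# Hypothetical country-code-to-language table; real dialing plans have
# variable-length codes and would need a proper prefix lookup.
COUNTRY_CODE_LANGS = {"34": "es", "39": "it", "44": "en", "49": "de"}

def propose_languages(caller_number, recipient_number, available):
    """Propose up to two transcription languages: one inferred from the
    caller's phone country code and one from the recipient's, restricted
    to the languages the answering machine system supports."""
    proposals = []
    for number in (caller_number, recipient_number):
        code = number.lstrip("+")[:2]
        lang = COUNTRY_CODE_LANGS.get(code)
        if lang in available and lang not in proposals:
            proposals.append(lang)
    return proposals

print(propose_languages("+49151234", "+44789999", {"de", "en", "fr"}))
# ['de', 'en']
```

The caller would then pick one of the proposed languages interactively, and the voice message would be transcribed in that language.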
191 DEVICE AND METHOD FOR CHANGING SHAPE OF LIPS ON BASIS OF AUTOMATIC WORD TRANSLATION EP13839930.8 2013-09-05 EP2899718A1 2015-07-29 Kim, Sang Cheol

Disclosed are a device and method for changing lip shapes based on automatic word translation. When a user takes a video of his or her own face and inputs his or her voice through a microphone, the device and method separate an area in which the user's lips are located from the video taken by the camera, recognize the user's voice, and insert a partial video into the area in which the user's lips are located, the partial video representing the lip shape for the word obtained when a specific word corresponding to the recognized voice is translated into a different language. Consequently, when the word input by the user's voice is translated into the different language, the lip shape may be automatically changed to accord with that language.

192 Apparatus and method for recognizing voice and text EP14175597.5 2014-07-03 EP2821991A1 2015-01-07 Chakladar, Subhojit

A method for recognizing a voice includes receiving, as an input, a voice involving multiple languages, recognizing a first voice of the voice by using a voice recognition algorithm matched to a preset primary language, identifying the preset primary language and a non-primary language different from the preset primary language, which are included in the multiple languages, determining a type of the non-primary language based on context information, recognizing a second voice of the voice in the non-primary language by applying a voice recognition algorithm, which is matched to the non-primary language of the determined type, to the second voice, and outputting a result of recognizing the voice which is based on a result of recognizing the first voice and a result of recognizing the second voice.

193 METHODS AND SYSTEMS FOR ADAPTING GRAMMARS IN HYBRID SPEECH RECOGNITION ENGINES FOR ENHANCING LOCAL SR PERFORMANCE EP12808947.1 2012-11-21 EP2783365A1 2014-10-01 XU, Kui; WENG, Fuliang; FENG, Zhe
A speech recognition method includes providing a processor communicatively coupled to each of a local speech recognition engine and a server-based speech recognition engine. A first speech input is inputted into the server-based speech recognition engine. A first recognition result from the server-based speech recognition engine is received at the processor. The first recognition result is based on the first speech input. The first recognition result is stored in a memory device in association with the first speech input. A second speech input is inputted into the local speech recognition engine. The first recognition result is retrieved from the memory device. A second recognition result is produced by the local speech recognition engine. The second recognition result is based on the second speech input and is dependent upon the retrieved first recognition result.
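The two-pass flow described above can be sketched as caching the server-based engine's result alongside the speech input that produced it, then letting the local engine condition a later recognition on that cached result. The class and the stand-in engine callables below are illustrative, not the patented implementation.

```python
class HybridRecognizer:
    """Stores server-side recognition results keyed by their speech input,
    so the local engine can use a prior result when recognizing later input."""

    def __init__(self, server_recognize, local_recognize):
        self.server_recognize = server_recognize
        self.local_recognize = local_recognize
        self.cache = {}  # memory device: speech input -> first result

    def first_pass(self, speech):
        """Send speech to the server engine and cache its result."""
        result = self.server_recognize(speech)
        self.cache[speech] = result
        return result

    def second_pass(self, speech, prior_speech):
        """Recognize locally, conditioned on the cached prior result."""
        prior = self.cache.get(prior_speech)
        return self.local_recognize(speech, prior)

hr = HybridRecognizer(
    server_recognize=lambda s: s.upper(),             # stand-in server engine
    local_recognize=lambda s, prior: f"{prior}|{s}",  # stand-in local engine
)
hr.first_pass("call home")
print(hr.second_pass("mom", "call home"))  # CALL HOME|mom
```

The design point is that the heavyweight server result is paid for once and then reused to bias the cheaper local engine.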
194 Information processing apparatus, sound operating system, and sound operating method for information processing apparatus EP14150611.3 2014-01-09 EP2755201A2 2014-07-16 Sekiguchi, Takaaki; Mori, Naoki; Shimizu, Atsushi

An information processing apparatus, such as an onboard apparatus mounted on a vehicle, allows the vehicle to be operated safely while it is running when an application is instructed through voice. An in-running operation acceptance/denial list is produced, i.e., a list of words denoting operations that are inhibited from being executed while the vehicle is running. A command acceptance/denial executing portion of the onboard apparatus determines whether the command corresponding to the content the user speaks can be operated while the vehicle is running, by referring to this list and the titles of the voice-operable commands included in the application. When the vehicle is running, as determined by referring to the running condition of the vehicle, execution of that command is instructed to an application controller portion if the command is determined to be operable; if it is determined to be inoperable, execution of that command is not instructed to the application controller portion.

195 HYBRIDIZED CLIENT-SERVER SPEECH RECOGNITION EP12713809.7 2012-02-22 EP2678861A1 2014-01-01 JUNEJA, Ajay
A recipient computing device can receive a speech utterance to be processed by speech recognition and segment the speech utterance into two or more speech utterance segments, each of which can be assigned to one of a plurality of available speech recognizers. A first one of the plurality of available speech recognizers can be implemented on a separate computing device accessible via a data network. A first segment can be processed by the first recognizer and the results of the processing returned to the recipient computing device, and a second segment can be processed by a second recognizer implemented at the recipient computing device.
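The segmentation-and-assignment step above can be sketched as dispatching each utterance segment to one of the available recognizers and collecting the results in segment order. The round-robin assignment and the stand-in recognizer callables are illustrative assumptions; real routing might depend on segment content or network conditions.

```python
def dispatch_segments(segments, recognizers):
    """Assign each speech utterance segment to one of the available
    recognizers (round-robin here) and collect results in order."""
    results = []
    for i, segment in enumerate(segments):
        recognizer = recognizers[i % len(recognizers)]
        results.append(recognizer(segment))
    return results

remote = lambda seg: f"remote:{seg}"  # stand-in for the networked recognizer
local = lambda seg: f"local:{seg}"    # stand-in for the on-device recognizer
print(dispatch_segments(["set a timer", "for ten minutes"], [remote, local]))
# ['remote:set a timer', 'local:for ten minutes']
```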
196 MASS ELECTRONIC QUESTION FILTERING AND ENHANCEMENT SYSTEM FOR AUDIO BROADCASTS AND VOICE CONFERENCES EP09789366.3 2009-09-24 EP2335239A1 2011-06-22 APPLEYARD, James, P.; WEISBARD, Keeley, L.; MATHAI, Shiju
A system for providing electronic filtering and enhancement for audio broadcasts and voice conferences. The system can comprise one or more computing devices configured to record one or more spoken segments, wherein the one or more spoken segments are comprised of utterances. The system can also include one or more electronic data processors configured to process, manage, and store the one or more spoken segments and data, wherein the one or more electronic data processors are communicatively linked to the one or more computing devices. The system can further include a speech-to-text module configured to execute on the one or more electronic data processors, wherein the speech-to-text module converts the one or more spoken segments into a plurality of text segments. Additionally, the system can include a database module configured to execute on the one or more electronic data processors, wherein the database module stores the plurality of text segments in a queue. The system can also include a filtration-prioritization module configured to execute on the one or more electronic data processors, wherein the filtration-prioritization module is configured to filter one or more text segments of the plurality of text segments in the queue, wherein the utterances to be filtered are defined in advance of filtering. The filtration-prioritization module can also be configured to determine a relevance of the one or more text segments. The filtration-prioritization module can be further configured to prioritize the one or more text segments based upon one or more of the relevance and a similarity of the one or more text segments to other text segments of the plurality of text segments in the queue. Moreover, the filtration-prioritization module can be configured to transmit the one or more text segments to a presenter.
197 SPOKEN LANGUAGE IDENTIFICATION SYSTEM AND METHODS FOR TRAINING AND OPERATING SAME EP05789342.2 2005-09-19 EP1800293B1 2011-04-13 LI, Haizhou, c/o Institute for Infocomm Research; MA, Bin, c/o Institute for Infocomm Research; WHITE, George M., c/o Institute for Infocomm Research
A method for training a spoken language identification system to identify an unknown language as one of a plurality of known candidate languages includes the process of creating a sound inventory comprising a plurality of sound tokens, the collective plurality of sound tokens provided from a subset of the known candidate languages. The method further includes providing a plurality of training samples, each training sample composed within one of the known candidate languages. Further included is the process of generating one or more training vectors from each training sample, wherein each training vector is defined as a function of said plurality of sound tokens provided from said subset of the known candidate languages. The method further includes associating each training vector with the candidate language of the corresponding training sample.
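One common way to realize "a training vector defined as a function of the sound tokens" is a normalized count over the shared inventory, one dimension per token; the sketch below uses that reading as an illustrative assumption, with invented token names, rather than the patent's specific function.

```python
def training_vector(token_sequence, sound_inventory):
    """Represent a tokenized training sample as normalized counts over
    the shared sound-token inventory (one dimension per sound token)."""
    counts = [token_sequence.count(token) for token in sound_inventory]
    total = sum(counts) or 1  # avoid division by zero for empty samples
    return [count / total for count in counts]

inventory = ["aa", "ih", "sh", "t"]  # illustrative sound tokens
sample = ["sh", "ih", "t", "ih"]     # tokenized training sample
print(training_vector(sample, inventory))  # [0.0, 0.5, 0.25, 0.25]
```

Each such vector would then be labeled with its sample's language, giving a standard vector-classification training set for the identifier.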
198 INTERACTIVE NATURAL LANGUAGE CALLING SYSTEM EP05813583.1 2005-12-06 EP2050263A1 2009-04-22 Simpson, Daniel John; Simpson, Kathleen Joan; Simpson, Kerrie Ann
An interactive voice response calling system (1) for automatically dialling a plurality of telephone numbers includes a database (10) containing records of dialling information for a dialling campaign. A dialler (20) translates the records in the database into dialling instructions. A calling unit (40) initiates a plurality of calls based on the dialling instructions. An interactive voice response unit (30) is operably connected to the calling unit (40) upon verification by the calling unit (40) that the connected call is answered by a person. The interactive voice response unit (30) includes a natural language recognition engine that automatically determines the language of a person and responds in the determined language, and storage for temporarily storing answers to the dialling campaign. The system further includes a voice print secure identification unit for verifying a voice of a subscriber, and a switch allowing the interactive voice response unit (30) to send information relating to a call to the dialler (20) for updating the dialling instructions.
199 Exploitation of language identification of media file data in speech dialog systems EP06020732.1 2006-10-02 EP1909263B1 2009-01-28 Willett, Daniel; Schwenninger, Jochen; Hennecke, Marcus; Brueckner, Raymond
200 METHOD FOR TRANSFORMING LANGUAGE INTO A VISUAL FORM EP06721331.4 2006-04-04 EP1866810A1 2007-12-19 FONG, Robert, Chin, Meng; CHONG, Billy, Nan, Choong
A computer assisted design system (100) that includes a computer system (102) and a text input device (103) that may be provided with text elements from a keyboard (104). A user may also provide oral input (107) to the text input device (103) or to voice recognition software with in-built artificial intelligence algorithms (110), which can convert spoken language into text elements. The computer system (102) includes an interaction design heuristic engine (116) that acts to understand and translate text and language into a visual form for display to the end user.