No. Patent Title Application No. Filing Date Publication No. Publication Date Inventors
181 Identifying persons near a mobile device user via social graphs, conversation models, and user context JP2014519046 2012-06-28 JP5930432B2 2016-06-08 Leonard Henry Grokop; Vidya Narayanan
182 Method and terminal for updating a voiceprint feature model JP2015509296 2013-07-08 JP2015516091A 2015-06-04 Ting Lu
The present invention is applicable to the field of speech recognition technology and provides a method and a terminal for updating a voiceprint feature model. The method includes: obtaining an original audio stream containing at least one speaker; obtaining the respective audio stream of each of the at least one speaker in the original audio stream by means of a preset speaker segmentation and clustering algorithm; separately matching each speaker's audio stream against the original voiceprint feature model to obtain a successfully matched audio stream; using the successfully matched audio stream as an additional audio-stream training sample for generating the voiceprint feature model; and updating the original voiceprint feature model. In the present invention, valid audio streams produced during calls are adaptively extracted and used as additional audio-stream training samples, so that the original voiceprint feature model is dynamically corrected, achieving the goal of improving the accuracy of the voiceprint feature model and the recognition accuracy while retaining comparatively high practicality.
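The update loop this abstract describes can be sketched as follows. This is only an illustrative assumption of the flow: the `cosine` similarity, the 0.8 threshold, and the pooled-mean "retraining" stand in for the patent's actual matching and model-training steps.

```python
def cosine(a, b):
    # similarity between two feature vectors; stand-in for real model scoring
    num = sum(x * y for x, y in zip(a, b))
    da = sum(x * x for x in a) ** 0.5
    db = sum(x * x for x in b) ** 0.5
    return num / (da * db) if da and db else 0.0

def mean_vector(frames):
    n = len(frames)
    return [sum(f[i] for f in frames) / n for i in range(len(frames[0]))]

def update_voiceprint(model, speaker_segments, threshold=0.8):
    """model: mean feature vector of the original voiceprint model.
    speaker_segments: diarization output, speaker label -> feature frames.
    Frames from speakers who match the model become extra training data."""
    matched = [f for frames in speaker_segments.values()
               if cosine(mean_vector(frames), model) >= threshold
               for f in frames]
    if not matched:
        return model  # no successfully matched audio stream; keep the model
    # "retrain" on the matched audio plus the original model (pooled mean)
    return mean_vector(matched + [model])
```

In a real system the matched segments would feed a proper re-estimation (e.g. GMM/i-vector adaptation); the pooled mean only illustrates where the extra samples enter.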
183 Modeling device and method for speaker recognition, and speaker recognition system JP2013542329 2010-12-10 JP5681811B2 2015-03-11 Haifeng Shen; Long Ma; Bingqi Zhang
184 Device and method for passphrase modeling for speaker verification, and speaker verification system JP2013542330 2010-12-10 JP2014502375A 2014-01-30 Long Ma; Haifeng Shen; Bingqi Zhang
A device and method for passphrase modeling for speaker verification, and a speaker verification system, are provided. The device includes a front end that receives enrollment speech from a target speaker, and a template generation unit that generates a passphrase template from the enrollment speech using a general speaker model. By taking into account the rich variation contained in the general speaker model, the device, method, and system ensure robust passphrase modeling even when enrollment data is insufficient, and even when only a single passphrase is available from the target speaker.
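One way the general speaker model could compensate for sparse enrollment data is to blend it with the enrollment statistics; the linear interpolation and the `weight` parameter below are illustrative assumptions, not the patent's actual template-generation method.

```python
def build_passphrase_template(enrollment_frames, general_model, weight=0.3):
    """Blend the (possibly single) enrollment utterance with a general
    speaker model so the template stays robust with little data.
    enrollment_frames: feature frames from the target speaker's passphrase.
    general_model: mean feature vector of the general speaker model."""
    n = len(enrollment_frames)
    enroll_mean = [sum(f[i] for f in enrollment_frames) / n
                   for i in range(len(general_model))]
    # interpolate toward the general model: more weight = more smoothing
    return [(1 - weight) * e + weight * g
            for e, g in zip(enroll_mean, general_model)]
```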
185 Apparatus and method for updating a speaker template JP2008275807 2008-10-27 JP5042194B2 2012-10-03 紫 三木; 雅美 野口
186 State detector, state detection method, and state detection program JP2010291190 2010-12-27 JP2012137680A 2012-07-19 Shoji Hayakawa; Naoji Matsuo
PROBLEM TO BE SOLVED: To provide a state detector that precisely detects the state of a specific speaker while reducing the processing load. SOLUTION: The state detector includes: first model generating means that generates a first specific-speaker model, in which the voice characteristics of a specific speaker in a non-suppressed state are modeled, in order to precisely detect the state of the specific speaker from information contained in the voice; second model generating means that generates a second specific-speaker model, in which the voice characteristics of the specific speaker in a suppressed state are modeled, by applying to the first specific-speaker model, based on correspondence information, the displacement of a second non-specific-speaker model with respect to a first non-specific-speaker model; likelihood calculation means that calculates a first likelihood of the first specific-speaker model with respect to the characteristics of an input voice and a second likelihood of the second specific-speaker model with respect to the input voice; and state determination means that determines the state of the speaker of the input voice based on the first and second likelihoods.
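The model-shifting and two-likelihood decision in this abstract can be sketched as below; the vector representation of models and the `likelihood` callback are illustrative assumptions (a real detector would use, e.g., GMM log-likelihoods).

```python
def derive_suppressed_model(specific, generic_normal, generic_suppressed):
    # shift the specific-speaker model by the displacement observed
    # between the two non-specific (generic) models
    return [s + (gs - gn)
            for s, gn, gs in zip(specific, generic_normal, generic_suppressed)]

def detect_state(features, model_normal, model_suppressed, likelihood):
    # pick the state whose model explains the input voice better
    l1 = likelihood(features, model_normal)    # first likelihood
    l2 = likelihood(features, model_suppressed)  # second likelihood
    return "suppressed" if l2 > l1 else "normal"
```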
187 Speaker recognition device, acoustic model updating method, and acoustic model update processing program JP2009508804 2007-03-30 JPWO2008126254A1 2010-07-22 外山 聡一; 藤田 育雄; 幸生 鴨志田
Provided are a speaker recognition device, an acoustic model updating method, and an acoustic model update processing program capable of accurately recognizing a speaker in accordance with the characteristics of the speaker's voice, which change over time. When the speaker of an utterance is judged to be the registered speaker corresponding to an adapted speaker model, the adapted speaker model is updated. The calculated voice feature quantities are stored in an adaptation feature storage unit 11; a new adapted speaker model is created by adapting the initial speaker model with the K most recent feature quantities stored there, counting back from the present; the new adapted speaker model is stored in a registered speaker model storage unit 9; and this new adapted speaker model is then used to judge whether the speaker of an utterance is the registered speaker corresponding to that model.
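The "K most recent feature quantities" bookkeeping can be sketched with a bounded buffer. The halfway-averaging "adaptation" below is an illustrative stand-in for whatever adaptation the patent actually performs (real systems would use, e.g., MAP adaptation).

```python
from collections import deque

class AdaptiveSpeakerModel:
    """Keep the K most recent verified feature vectors and re-adapt the
    initial speaker model from them, as the abstract describes."""

    def __init__(self, initial_model, k):
        self.initial = list(initial_model)
        self.recent = deque(maxlen=k)  # drops oldest beyond K automatically

    def accept_utterance(self, features):
        # called after the utterance was verified as the registered speaker
        self.recent.append(list(features))

    def adapted_model(self):
        if not self.recent:
            return list(self.initial)
        n = len(self.recent)
        mean = [sum(f[i] for f in self.recent) / n
                for i in range(len(self.initial))]
        # pull the initial model halfway toward the recent mean (toy rule)
        return [(a + m) / 2 for a, m in zip(self.initial, mean)]
```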
188 Speaker model registration apparatus and method in a speaker recognition system, and computer program JP2008507435 2007-03-16 JPWO2007111169A1 2009-08-13 外山 聡一
A speaker model registration apparatus (10) in a speaker recognition system (1) registers speaker models for speaker recognition. The apparatus includes: acquisition means (13) that acquires n+α utterances (where n is an integer of 2 or more and α is an integer of 1 or more); calculation means (20) that calculates a speaker model using the acquired n utterances as enrollment utterances; verification means (30) that verifies the calculated speaker model using the acquired α utterances as verification utterances; and registration means (40) that registers, as a speaker model for speaker recognition, a verified speaker model whose verification result satisfies a predetermined criterion.
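The n+α protocol above reduces to a small function; `train`, `score`, and `criterion` are caller-supplied stand-ins for the patent's model calculation, verification, and acceptance criterion.

```python
def register_speaker_model(utterances, n, alpha, train, score, criterion):
    """Train on the first n utterances, then verify the resulting model
    against the remaining alpha utterances; register only on success."""
    if len(utterances) < n + alpha:
        raise ValueError("need at least n + alpha utterances")
    model = train(utterances[:n])                       # enrollment
    results = [score(model, u) for u in utterances[n:n + alpha]]  # verification
    # register only when every verification result meets the criterion
    return model if all(criterion(r) for r in results) else None
```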
189 Voice processing apparatus and program JP2006349210 2006-12-26 JP4305509B2 2009-07-29 靖雄 吉岡; 毅彦 川原
190 Automatic identification of telephone callers based on voice characteristics JP2005005572 2005-01-12 JP4221379B2 2009-02-12 Andrei Pascovici
A method and apparatus are provided for identifying a caller of a call from the caller to a recipient. A voice input is received from the caller, and characteristics of the voice input are applied to a plurality of acoustic models, which include a generic acoustic model and acoustic models of any previously identified callers, to obtain a plurality of respective acoustic scores. The caller is identified as one of the previously identified callers or as a new caller based on the plurality of acoustic scores. If the caller is identified as a new caller, a new acoustic model is generated for the new caller, which is specific to the new caller.
191 Speech processor and program JP2006349210 2006-12-26 JP2008158396A 2008-07-10 KAWAHARA TAKEHIKO; YOSHIOKA YASUO
PROBLEM TO BE SOLVED: To effectively reflect the original features of a speaker in registered information. SOLUTION: A storage device 50 stores registered information R including a feature quantity CA of a speech. A judgment unit 12 judges whether an input speech VIN is suitable for generating or updating the registered information R. Only when the judgment unit 12 judges that the input speech VIN is suitable does a managing unit 14 generate or update the registered information R based on the feature quantity CA of the input speech VIN. When the judgment unit 12 judges that the input speech VIN is not suitable, an informing unit 15 informs the speaker accordingly.
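The suitability gate described here is a simple control-flow pattern; all the callables below (`is_suitable`, `merge`, `notify`) are illustrative stand-ins for the judgment, managing, and informing units.

```python
def maybe_update_registration(registered, features, is_suitable, merge, notify):
    """Update the registered information only if the input speech is judged
    suitable; otherwise notify the speaker and leave the data unchanged.
    Returns (registered_info, whether_an_update_happened)."""
    if is_suitable(features):
        return merge(registered, features), True
    notify("input speech unsuitable for registration")  # inform the speaker
    return registered, False
```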
192 Speaker authentication enrollment and verification method and device JP2007099947 2007-04-06 JP2007279743A 2007-10-25 Jian Luan; Pei Ding; Lei He; Jie Hao
PROBLEM TO BE SOLVED: To provide a speaker authentication enrollment and verification method and device. SOLUTION: The speaker authentication enrollment method comprises: extracting an acoustic feature vector sequence from the enrollment utterance of a speaker, and generating a speaker template using the acoustic feature vector sequence. The extraction step comprises: generating a filter bank for the enrollment utterance based on the positions and energies of the formants in the spectrum of the enrollment utterance; filtering the spectrum of the enrollment utterance with the generated filter bank; and generating the acoustic feature vector sequence from the filtered enrollment utterance.
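A formant-driven filter bank of the kind sketched in this abstract might look like the following; placing one filter per detected formant bin is the idea from the abstract, while the triangular shape and `half_width` parameter are illustrative choices.

```python
def triangular_filter(center, half_width, n_bins):
    # unit-height triangle centered on a formant bin
    return [max(0.0, 1.0 - abs(i - center) / half_width)
            for i in range(n_bins)]

def formant_filter_bank(formant_bins, half_width, n_bins):
    # one filter per detected formant position
    return [triangular_filter(c, half_width, n_bins) for c in formant_bins]

def apply_filter_bank(spectrum, bank):
    # each output coefficient is the spectral energy seen through one filter
    return [sum(w * s for w, s in zip(filt, spectrum)) for filt in bank]
```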
193 Speech biometrics system, method, and computer program for determining whether to accept or reject a subject for enrollment JP2006009047 2006-01-17 JP2006285205A 2006-10-19 Prabha Sundaram
PROBLEM TO BE SOLVED: To provide a speech biometrics system and method that determine whether to accept or reject a subject for enrollment in a biometric system. SOLUTION: A template is generated from feature vectors extracted from a first instance of biometric input obtained from the subject. Feature vectors extracted from a second instance of the biometric input obtained from the subject are compared with the template, and a matching score is generated based on the degree of similarity between the first and second instances of the biometric input. When the matching score satisfies a designated reference threshold, the subject is accepted for enrollment in the biometric system.
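The two-capture acceptance test reduces to a template, a score, and a threshold; the `similarity` function below is a toy stand-in for the patent's actual matching score.

```python
def similarity(a, b):
    # toy matching score in (0, 1]: higher means more similar
    d = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + d)

def try_enroll(first_instance, second_instance, threshold=0.5):
    """Build a template from the first capture, score the second capture
    against it, and accept the subject only if the score clears the
    reference threshold. Returns (template_or_None, score)."""
    template = list(first_instance)
    score = similarity(template, second_instance)
    return (template if score >= threshold else None), score
```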
194 Automatic identification of a telephone caller based on voice characteristics JP2005005572 2005-01-12 JP2005227758A 2005-08-25 Andrei Pascovici
PROBLEM TO BE SOLVED: To provide a method and a device for identifying a caller when a recipient receives a call from the caller. SOLUTION: When a voice input from the caller is received by the recipient, the characteristics of the voice input are applied to a plurality of acoustic models, which include a generic acoustic model and the acoustic models of previously identified callers, and a plurality of respective acoustic scores are obtained. Based on the plurality of acoustic scores, the caller is identified either as one of the previously identified callers or as a new caller. When the caller is identified as a new caller, a new acoustic model specific to that caller is generated.
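The decision rule implied by this abstract (known caller vs. new caller via the generic model) can be sketched as below; the dictionary-of-scores interface is an illustrative assumption.

```python
def identify_caller(voice_scores, generic_score):
    """voice_scores: acoustic score of the input against each previously
    identified caller's model (higher is better); generic_score: score
    against the generic model. Returns a caller id, or None meaning
    'new caller' (for whom a caller-specific model would be created)."""
    if voice_scores:
        best = max(voice_scores, key=voice_scores.get)
        # a known caller must beat the generic model to be accepted
        if voice_scores[best] > generic_score:
            return best
    return None
```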
195 Probabilistic matching method for speaker authentication JP6345198 1998-03-13 JP3630216B2 2005-03-16 Qi P. Li
196 Method for the voice-activated identification of users of telecommunications lines in a telecommunications network during dialogue with a voice-activated dialogue system JP2001550740 2000-12-15 JP2003519811A 2003-06-24 Thomas Ziem; Marian Trinkel; Christel Müller; Fred Runge
The invention relates to a method for the voice-activated identification of users of telecommunications lines in a telecommunications network during dialogue with a voice-activated dialogue system. According to the invention, utterances made during human-human and/or human-machine dialogues by a caller from a group of callers restricted to one telecommunications line are used to apply a reference pattern to that caller. A user identifier, which is activated when the caller is identified, is stored for each reference pattern and is provided, together with the CLI and/or ANI identifier of the telecommunications line, to a server hosting the voice-controlled dialogue system. Based on the CLI including the user identifier, the data previously stored for this user is retrieved by the system and made available to the customer dialogue interface. The method is preferably used in voice-controlled dialogue systems.
197 Speaker identification method JP2002133234 2002-05-08 JP2002372992A 2002-12-26 Thomas Kemp
PROBLEM TO BE SOLVED: To reduce the burden on the user in the enrollment phase while maintaining a high identification rate for enrolled speakers. SOLUTION: In the speaker identification method, the enrollment speech of a speaker is collected and/or stored as initial identification speech data, from which speaker identification and/or classification data for the speaker is generated and/or stored in a speaker database. Application speech uttered by the current speaker is received and evaluated against the speaker identification and/or classification data in the speaker database; the current speaker is classified at least as either a known speaker or an unknown speaker; and at least a portion of the application speech received from a known speaker is used as additional identification speech data.
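The enrollment-lightening idea in this abstract (reuse a known speaker's application speech as extra identification data) can be sketched as follows; the score dictionary and the database-as-dict are illustrative assumptions.

```python
def classify_and_extend(db, speaker_scores, app_speech, threshold):
    """db: speaker -> list of identification speech data.
    speaker_scores: evaluation of the application speech against each
    enrolled speaker. If the best speaker clears the threshold, treat the
    current speaker as known and grow that speaker's data; otherwise
    classify as unknown and leave the database untouched."""
    if speaker_scores:
        best = max(speaker_scores, key=speaker_scores.get)
        if speaker_scores[best] >= threshold:
            db[best].append(app_speech)  # enrollment grows without user effort
            return best
    return None  # unknown speaker
```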
198 Two-stage cohort selection method for a speaker verification system JP20426197 1997-07-30 JPH1083194A 1998-03-31 William D. Goldenthal; Brian S. Eberman
PROBLEM TO BE SOLVED: To reduce the equal error rate of a speaker verification process that validates an identity claimed by an unknown speaker, using a two-stage cohort selection technique, so that a speaker can be verified as the claimed person independently of the content of the speech. SOLUTION: A training speech signal 102 is processed by a model generator 140 and the like to form a set of acoustic models 150 characterizing the training speech signal 102. In addition, the training speech signal 102 is compared with the model sets of other speakers, and several cohort model sets 170 are selected from the available acoustic model sets of the other speakers. The difference between the score obtained against the acoustic model 150 corresponding to the claimed identity 201 and the score obtained against the corresponding cohort models 170 is determined. A validating means 260 compares this difference with a threshold value, and the claim is accepted as true when the difference exceeds the threshold.
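The final decision step described here is a score-difference test against a threshold; the function below sketches it, with the scores assumed to be precomputed log-likelihood-style values (higher is better).

```python
def verify_claim(claim_score, cohort_scores, threshold):
    """Accept the claimed identity only when the score against the claimed
    speaker's model exceeds the best competing cohort score by more than
    the threshold (normalizing away channel/content effects)."""
    diff = claim_score - max(cohort_scores)
    return diff > threshold
```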
199 Speaker verification system JP35970496 1996-12-20 JPH09218697A 1997-08-19 Malcolm I. Hanna; Andrew T. Sapeluk; Robert I. Damper; Ian M. Rogers
PROBLEM TO BE SOLVED: To provide a speaker verification system capable of efficiently verifying a speaker. SOLUTION: In the speaker verification system, the reference-data generating method includes analyzing two or more utterances of the speaker to be enrolled in the system to produce a matrix of average cepstral coefficients representing the utterances. A genetic learning algorithm compares this matrix with cepstral coefficient matrices 102 obtained from utterances of the speaker in question and of other speakers, in order to determine the coefficients of the weighting vectors 104 used during comparison. The reference data, containing the average cepstral coefficient matrix 102 and the weighting-vector coefficients 104 obtained by the genetic learning algorithm, is recorded on a user card so that the speaker can use it later when requesting service from a self-service terminal.
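The comparison step the weights feed into might look like the sketch below; in the patent the weights would come from the genetic learning algorithm, whereas here they are simply an input, and the absolute-difference distance is an illustrative choice.

```python
def weighted_cepstral_distance(ref_matrix, test_matrix, weights):
    """Element-wise weighted comparison of two average-cepstrum matrices
    (rows = frames or segments, columns = cepstral coefficients).
    Smaller distance means the test utterance matches the reference."""
    total = 0.0
    for ref_row, test_row, w_row in zip(ref_matrix, test_matrix, weights):
        total += sum(w * abs(r - t)
                     for r, t, w in zip(ref_row, test_row, w_row))
    return total
```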
200 JPH01502058A - JP50039987 1987-12-09 JPH01502058A 1989-07-13