No. Patent title Application No. Filing date Publication No. Publication date Inventors
61 Audio word processor using divided utterance JP24398384 1984-11-19 JPS61121167A 1986-06-09 SATO YASUO; MATSUI HARUKI
PURPOSE: To improve the KANA (Japanese syllabary)/KANJI (Chinese character) conversion rate by passing silent-time information for voice input uttered in word or clause divisions from a voice recognition part to a KANA/KANJI conversion part together with the recognition result, and then dividing the subject of conversion into small sections according to the silent-time information. CONSTITUTION: The voice parameter of the sentence data supplied from a microphone 1 is extracted by a voice parameter extracting part 2. Based on this extracted parameter, a sound/silence deciding part 3 decides the silent time information on the voice input and supplies it to a voice recognition part 5 as well as a voice time measuring part 4. The part 5 selects a CV voice series or a CV syllable candidate series according to a standard pattern given from a sound dictionary 6 and adds the selected series to a CV syllable candidate series memory part 8. The silent time measured by the part 4 and the recognition result of the part 5 are then supplied to a division mark inserting part 7. A KANA/KANJI conversion part 9 is controlled by the division mark given from the part 7, so that the candidate series to be converted, which is stored in the part 8, is divided into small sections. This improves the KANA/KANJI conversion rate. COPYRIGHT: (C)1986,JPO&Japio
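The division step in this abstract can be illustrated with a minimal sketch. It assumes each recognized syllable comes with the length of the pause that follows it; the function name `split_by_silence`, the millisecond unit and the 300 ms threshold are hypothetical choices for illustration and are not taken from the patent.

```python
def split_by_silence(syllables, silences_ms, threshold_ms=300):
    """Cut a recognized syllable series into sections at long pauses.

    syllables   -- recognized syllable candidates, in order
    silences_ms -- pause length after each syllable (assumed unit: ms)
    threshold_ms -- hypothetical pause length that counts as a division mark
    """
    if len(syllables) != len(silences_ms):
        raise ValueError("one silence value is needed per syllable")
    sections, current = [], []
    for syllable, gap in zip(syllables, silences_ms):
        current.append(syllable)
        if gap >= threshold_ms:      # pause long enough: insert a division mark
            sections.append(current)
            current = []
    if current:                      # keep the trailing section, if any
        sections.append(current)
    return sections


sections = split_by_silence(["ki", "sha", "ga", "ki", "ta"], [20, 30, 400, 25, 0])
```

Each resulting section could then be handed to a conversion stage separately, which is the effect the abstract attributes to the division marks.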
62 SEGMENTATION AND TRANSMISSION OF AUDIO STREAMS EP06840473.0 2006-12-12 EP1961154A4 2016-03-09 MCCUE, John; MCCUE, Robert; SHOSTAKOVSKY, Gregory; MCCUE, Glenn
A method of downloading digital content to be rendered is provided in which a list of content servers that are capable of serving requested digital content is downloaded from a network accessible server. Service level statistics are tracked for the content servers in the list of content servers. A first content server to serve the requested digital content is selected from the list of content servers in dependence upon the service level statistics. A first segment of the requested digital content is downloaded from the first content server for rendering. In the event of a degradation in service, a second content server to replace the first content server is selected from the list of content servers in dependence upon the service level statistics, wherein the server replacement is substantially imperceptible. A second segment of the requested digital content is downloaded from the second content server for rendering.
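The server selection and replacement logic described above might be sketched as follows. This is only an illustration under stated assumptions: the service level statistic is taken to be a single latency figure per server, and `assign_servers`, `stats_per_step` and the 500 ms degradation limit are hypothetical names and values, not part of the patent.

```python
def assign_servers(stats_per_step, degraded_ms=500.0):
    """Pick a content server for each segment from tracked statistics.

    stats_per_step -- one dict per segment, mapping server name to its
                      currently tracked latency in ms (assumed statistic)
    degraded_ms    -- hypothetical latency above which service counts as degraded
    """
    current = None
    plan = []
    for stats in stats_per_step:
        if current is None or stats.get(current, float("inf")) > degraded_ms:
            # Prefer a different server; fall back to all servers if none is left.
            candidates = [s for s in stats if s != current] or list(stats)
            current = min(candidates, key=stats.get)
        plan.append(current)
    return plan


plan = assign_servers([
    {"a": 100.0, "b": 200.0},   # segment 1: "a" has the best statistics
    {"a": 900.0, "b": 210.0},   # segment 2: "a" has degraded, switch to "b"
])
```

The actual download of each segment is left out; the sketch only shows how the choice of server can follow the statistics from one segment to the next.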
63 Audio device, voice data dividing method and program JP2008133766 2008-05-22 JP2009282260A 2009-12-03 KIMURA SATOSHI
PROBLEM TO BE SOLVED: To provide an audio device, an information processing method and a program, with which the context of a division position can be obtained and the division position can be easily checked. SOLUTION: The audio device 100, in which voice data can be divided at an arbitrary position, includes: an output section 80 for reproducing voice data; an operation section 10 with which the division position of the reproduced voice data is indicated, and with which division of the voice data at the indicated division position is determined; and a control section 30 which reproduces the voice data of a fixed range before and after the indicated division position when the division position is indicated by the operation section 10, and which divides the voice data at the determined division position when the division position is determined by the operation section 10 during the reproduction. COPYRIGHT: (C)2010,JPO&INPIT
64 Method and apparatus for audio coding and decoding using frequency segmentation KR1020087001012 2006-07-14 KR101343267B1 2013-12-18 메로트라,산지브; 첸,웨이-게
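Frequency segmentation is important to the encoding quality of spectral data. Segmentation involves breaking the spectral data into units called sub-bands or vectors. Uniform segmentation may be suboptimal. Various features are described that provide segmentation dependent on the intensity of the spectral data. Finer segmentation is provided for regions of greater spectral variation, and coarser segmentation is provided for more homogeneous regions. Sub-bands with similar characteristics can be merged with little effect on quality, whereas sub-bands with highly variable data may be better represented if the sub-band is split. Various methods are described for measuring the tonality, energy, or shape of a sub-band. These measurements are discussed in terms of deciding when to split or merge sub-bands in order to provide variable frequency segmentation. Keywords: coding, codeword, spectral coefficient, code vector, encoding, decoding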
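As a rough illustration of variation-dependent segmentation, the sketch below splits a band of spectral coefficients in half whenever the variance inside it exceeds a threshold. Using population variance as the measure, halving as the split rule, and the names `segment_band`, `var_threshold` and `min_width` are all assumptions made for this example; the abstract itself mentions tonality, energy and shape measures without giving formulas.

```python
from statistics import pvariance


def segment_band(coeffs, start, end, var_threshold, min_width=1):
    """Recursively split coeffs[start:end] into sub-bands.

    A band is kept whole when it is already at most min_width wide or
    when its variance is at or below var_threshold; otherwise it is
    split in half. Returns a list of (start, end) index pairs.
    """
    width = end - start
    if width <= max(min_width, 1) or pvariance(coeffs[start:end]) <= var_threshold:
        return [(start, end)]
    mid = start + width // 2
    return (segment_band(coeffs, start, mid, var_threshold, min_width)
            + segment_band(coeffs, mid, end, var_threshold, min_width))


# Flat region on the left, strongly varying region on the right.
bands = segment_band([1, 1, 1, 1, 0, 10, 0, 10], 0, 8, var_threshold=1.0, min_width=2)
```

The flat half stays one coarse sub-band while the varying half is cut into narrower ones, which matches the finer-versus-coarser behavior the abstract describes. Merging of similar neighbors is not shown.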
65 OUT-OF-VOCABULARY WORD REJECTION ALGORITHMS USING PHONEME SEGMENTATION IN VARIABLE VOCABULARY WORD RECOGNITION PCT/KR2003/001200 2003-06-18 WO2004111998A1 2004-12-23 KANG, Chul Ho; LEE, In Hak

Disclosed is a phoneme segmentation method designed to improve the rejection of out-of-vocabulary words by performing utterance authentication with phoneme segmentation, based on the result recognized in variable vocabulary word recognition. In an out-of-vocabulary word rejection apparatus using phoneme segmentation in variable vocabulary word recognition, an initial sound detector detects an initial sound of sampled input speech data. A middle sound detector detects the starting point of the middle sound of a voiced sound, which exists in every phoneme, after the initial sound is detected by the initial sound detector. An unvoiced sound initial sound detector detects the starting point of the initial sound of an unvoiced sound before the middle sound is detected by the middle sound detector. The detector detects the starting point of a phoneme which is detected by the middle sound detector and the unvoiced sound initial sound detector. An out-of-vocabulary word rejecting section performs utterance authentication using the values derived from the initial sound detector, the middle sound detector, the unvoiced sound initial sound detector, and the detector.

66 Edit device using audio/video division method and synthesis method JP4860096 1996-03-06 JPH09247621A 1997-09-19 NINOMIYA MASAKO; FUJIOKA SOICHIRO; MITANI HIROSHI; NISHIDA MICHIFUMI; INAI MICHIFUMI
PROBLEM TO BE SOLVED: To attain higher-speed processing by improving the efficiency of division and synthesis processing with simple hardware, and by recording video data and data other than the video data on plural recorders while separating the recording tracks for the data. SOLUTION: An edit device 101 is provided with an AUDIO/VIDEO division section 106a and an AUDIO/VIDEO synthesis section 107b, which apply division and synthesis processing to VIDEO data and to data consisting of AUDIO data, SUBCODE data and V-AUX data through hardware processing. An AUDIO/VIDEO division section 103a and an AUDIO/VIDEO synthesis section 103b apply division and synthesis processing to data other than the VIDEO data through software processing. Divided digital compression data are continuously recorded to a VIDEO data recording area and a non-VIDEO data recording area of hard disk units a117, b118 respectively. COPYRIGHT: (C)1997,JPO
67 Method for segmenting audio signals US10907851 2005-04-18 US08521529B2 2013-08-27 Michael M. Goodwin; Jean Laroche
An input signal is converted to a feature-space representation. The feature-space representation is projected onto a discriminant subspace using a linear discriminant analysis transform to enhance the separation of feature clusters. Dynamic programming is used to find global changes to derive optimal cluster boundaries. The cluster boundaries are used to identify the segments of the audio signal.
68 Method of segmenting an audio stream US10370065 2003-02-21 US20030171936A1 2003-09-11 Mikhael A. Sall; Sergei N. Gramnitskiy; Alexandr L. Maiboroda; Victor V. Redkov; Anatoli I. Tikhotsky; Andrei B. Viktorov
Disclosed herein is a segmentation method, which divides an input audio stream into segments containing different homogeneous signals. The main objective of this method is localization of segments with stationary properties. This method seeks all non-stationary points or intervals in the audio stream and creates a list of segments. The obtained list of segments can be used as input data for subsequent procedures, such as classification, speech/music/noise attribution and so on. The proposed segmentation method is based on the analysis of variation in the statistical features of the audio signal and comprises three main stages: a stage of first-grade characteristics calculation, a stage of second-grade characteristics calculation and a stage of decision-making.
69 Audio signal segmentation algorithm US11589772 2006-10-31 US07774203B2 2010-08-10 Jhing-Fa Wang; Chao-Ching Huang; Dian-Jia Wu
The present invention discloses an audio signal segmentation algorithm comprising the following steps. First, an audio signal is provided. Then, an audio activity detection (AAD) step is applied to divide the audio signal into at least one noise segment and at least one noisy audio segment. Then, an audio feature extraction step is used on the noisy audio segment to obtain multiple audio features. Then, a smoothing step is applied. Then, multiple speech frames and multiple music frames are discriminated. The speech frames and the music frames compose at least one speech segment and at least one music segment. Finally, the speech segment and the music segment are segmented from the noisy audio segment.
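The smoothing and segment-forming steps at the end of this pipeline can be sketched as below, assuming the per-frame speech/music decisions are already available as labels. Majority voting over a sliding window is only one possible smoothing step and is an assumption here, as are the names `smooth_labels`, `frames_to_segments` and the window size of 3.

```python
from collections import Counter
from itertools import groupby


def smooth_labels(labels, window=3):
    """Replace each frame label by the majority label in a centered window."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - half):i + half + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed


def frames_to_segments(labels):
    """Merge runs of equal labels into (label, start_frame, end_frame) tuples."""
    segments, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments


frames = ["speech", "speech", "music", "speech", "speech", "music", "music", "music"]
segments = frames_to_segments(smooth_labels(frames))
```

In this example the isolated "music" frame inside the speech run is voted away, so the frames collapse into one speech segment followed by one music segment.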
70 Audio signal segmentation algorithm US11589772 2006-10-31 US20070271093A1 2007-11-22 Jhing-Fa Wang; Chao-Ching Huang; Dian-Jia Wu
The present invention discloses an audio signal segmentation algorithm comprising the following steps. First, an audio signal is provided. Then, an audio activity detection (AAD) step is applied to divide the audio signal into at least one noise segment and at least one noisy audio segment. Then, an audio feature extraction step is used on the noisy audio segment to obtain multiple audio features. Then, a smoothing step is applied. Then, multiple speech frames and multiple music frames are discriminated. The speech frames and the music frames compose at least one speech segment and at least one music segment. Finally, the speech segment and the music segment are segmented from the noisy audio segment.
71 Effective audio segmentation and classification US11578300 2005-06-06 US08838452B2 2014-09-16 Reuben Kan; Dmitri Katchalov; Muhammad Majid; George Politis; Timothy John Wark
A method (400) and system (200) for classifying an audio signal are described. The method (400) operates by first receiving a sequence of audio frame feature data, each of the frame feature data characterising an audio frame along the audio segment. In response to receipt of each of the audio frame feature data, statistical data characterising the audio segment is updated with the received frame feature data. The received frame feature data is then discarded. A preliminary classification for the audio segment may be determined from the statistical data. Upon receipt of a notification of an end boundary of the audio segment, the audio segment is classified (410) based on the statistical data.
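The update-then-discard pattern described here can be illustrated with running statistics that never store the individual frames. The sketch uses Welford's online algorithm for mean and variance of a single scalar feature; the choice of statistics, the class name `SegmentStats`, the threshold and the two output labels are hypothetical and are not taken from the patent.

```python
class SegmentStats:
    """Running mean and variance of one scalar frame feature (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, value):
        """Fold one frame feature into the statistics; the frame is not kept."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Population variance of the features seen so far."""
        return self._m2 / self.count if self.count else 0.0


def classify(stats, variance_threshold=1.0):
    """Hypothetical two-way decision based only on the accumulated statistics."""
    return "high_variation" if stats.variance > variance_threshold else "low_variation"


stats = SegmentStats()
for feature in [1.0, 2.0, 3.0, 4.0]:   # frame features arrive one at a time
    stats.update(feature)
label = classify(stats)                 # called once the end boundary is signalled
```

Because only `count`, `mean` and `_m2` are retained, memory use stays constant regardless of segment length, and `classify` can also be called earlier to obtain a preliminary result.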
72 Effective Audio Segmentation and Classification US11578300 2005-06-06 US20090006102A1 2009-01-01 Reuben Kan; Dmitri Katchalov; Muhammad Majid; George Politis; Timothy John Wark
A method (400) and system (200) for classifying an audio signal are described. The method (400) operates by first receiving a sequence of audio frame feature data, each of the frame feature data characterising an audio frame along the audio segment. In response to receipt of each of the audio frame feature data, statistical data characterising the audio segment is updated with the received frame feature data. The received frame feature data is then discarded. A preliminary classification for the audio segment may be determined from the statistical data. Upon receipt of a notification of an end boundary of the audio segment, the audio segment is classified (410) based on the statistical data.
73 Method of segmenting an audio stream US10370065 2003-02-21 US07346516B2 2008-03-18 Mikhael A. Sall; Sergei N. Gramnitskiy; Alexandr L. Maiboroda; Victor V. Redkov; Anatoli I. Tikhotsky; Andrei B. Viktorov
Disclosed herein is a segmentation method, which divides an input audio stream into segments containing different homogeneous signals. The main objective of this method is localization of segments with stationary properties. This method seeks all non-stationary points or intervals in the audio stream and creates a list of segments. The obtained list of segments can be used as input data for subsequent procedures, such as classification, speech/music/noise attribution and so on. The proposed segmentation method is based on the analysis of variation in the statistical features of the audio signal and comprises three main stages: a stage of first-grade characteristics calculation, a stage of second-grade characteristics calculation and a stage of decision-making.
74 SEGMENTING AUDIO SIGNALS INTO AUDITORY EVENTS EP02721201.8 2002-02-26 EP1393300A1 2004-03-03 CROCKETT, Brett, G.
In one aspect, the invention divides an audio signal into auditory events, each of which tends to be perceived as separate and distinct, by calculating the spectral content of successive time blocks of the audio signal (5-1), calculating the difference in spectral content between successive time blocks of the audio signal (5-2), and identifying an auditory event boundary as the boundary between successive time blocks when the difference in the spectral content between such successive time blocks exceeds a threshold (5-3). In another aspect, the invention generates a reduced-information representation of an audio signal by dividing an audio signal into auditory events, each of which tends to be perceived as separate and distinct, and formatting and storing information relating to the auditory events (5-4). Optionally, the invention may also assign a characteristic to one or more of the auditory events (5-5).
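Steps 5-1 to 5-3 can be sketched in a few lines. The example below is an illustration under assumptions: it uses a naive DFT over non-overlapping blocks, normalizes each magnitude spectrum to sum to one, and measures the difference as a sum of absolute differences. The normalization, the distance measure, and the names `magnitude_spectrum` and `event_boundaries` are choices made for this sketch rather than details given in the abstract.

```python
import cmath
import math


def magnitude_spectrum(block):
    """Naive DFT magnitudes for bins 0..N/2 of one time block."""
    n = len(block)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(block)))
            for k in range(n // 2 + 1)]


def normalize(spectrum):
    """Scale a spectrum to unit sum so that loudness changes alone do not count."""
    total = sum(spectrum)
    return [s / total for s in spectrum] if total > 0 else list(spectrum)


def event_boundaries(signal, block_size, threshold):
    """Return sample indices where the spectral content changes by more than threshold."""
    blocks = [signal[i:i + block_size]
              for i in range(0, len(signal) - block_size + 1, block_size)]
    spectra = [normalize(magnitude_spectrum(b)) for b in blocks]
    boundaries = []
    for i in range(1, len(spectra)):
        difference = sum(abs(a - b) for a, b in zip(spectra[i - 1], spectra[i]))
        if difference > threshold:
            boundaries.append(i * block_size)
    return boundaries


# Two blocks of a low tone followed by two blocks of a higher tone.
low = [math.sin(2 * math.pi * 2 * t / 32) for t in range(64)]
high = [math.sin(2 * math.pi * 8 * t / 32) for t in range(64)]
boundaries = event_boundaries(low + high, block_size=32, threshold=0.5)
```

With this input the only boundary reported is at the sample where the tone changes. A practical implementation would likely use an FFT and windowed blocks instead of the naive DFT shown here.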
75 AUDIO CONTENT SEGMENTATION METHOD AND APPARATUS EP15877438.0 2015-01-15 EP3239982A1 2017-11-01 ZHOU, Wenyu; LI, Zijun; YANG, Fen

Embodiments of the present invention relate to the audio field and provide an audio content segmentation method and an apparatus, to capture audio content by means of interaction between user equipment and a server. The method includes: receiving a segmentation location message sent by user equipment; searching, according to an audio identifier of audio content, for at least one piece of second segmentation location information matching the audio identifier of the audio content; determining at least one piece of target segmentation location information from at least one piece of first segmentation location information and determining at least one piece of reference segmentation location information from the at least one piece of second segmentation location information according to the at least one piece of first segmentation location information and the at least one piece of second segmentation location information; determining at least one piece of third segmentation location information according to the at least one piece of target segmentation location information and reference segmentation location information corresponding to each piece of target segmentation location information; and sending a segmentation location recommendation message to the user equipment.

76 AUDIO SPLITTING WITH CODEC-ENFORCED FRAME SIZES EP10842739.4 2010-12-21 EP2517121A4 2016-10-12 OWEN, Calvin Ryan
77 SEGMENTING AUDIO SIGNALS INTO AUDITORY EVENTS EP02721201.8 2002-02-26 EP1393300B1 2012-11-28 CROCKETT, Brett, G.
78 AUDIO SPLITTING WITH CODEC-ENFORCED FRAME SIZES EP10842739.4 2010-12-21 EP2517121A1 2012-10-31 OWEN, Calvin Ryan
A method and apparatus for splitting the audio of media content into separate content files without introducing boundary artifacts is described.
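79 Recording and reproducing system for time-division multiplexed digital audio and video signals TW07113056 1982-09-16 TW049901B 1983-04-16 柴本猛; 高島征一; 高橋宣明; 鈴木富士男 (and others, 5 inventors in total)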
80 Lossless multi-channel audio codec using adaptive segmentation with random access point (RAP) and multiple prediction parameter set (MPPS) capability KR1020107017781 2009-01-09 KR101612969B1 2016-04-15 페조조란
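The present invention provides a lossless audio codec that encodes/decodes a lossless variable bit rate (VBR) bitstream using random access point (RAP) capability to initiate lossless decoding at a specified segment within a frame and/or multiple prediction parameter set (MPPS) capability partitioned to mitigate transient effects. This is accomplished with an adaptive segmentation technique that fixes segment start points based on constraints imposed by the presence of a detected transient and/or a desired RAP in the frame, and selects an optimal segment duration in each frame to reduce the encoded frame payload subject to an encoded segment payload constraint. RAP and MPPS are applicable to improving overall performance, particularly for longer frame durations.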