MPEG-7 Audio and Beyond


MPEG-7 Audio and Beyond
Audio Content Indexing and Retrieval

Hyoung-Gook Kim
Samsung Advanced Institute of Technology, Korea

Nicolas Moreau
Technical University of Berlin, Germany

Thomas Sikora
Communication Systems Group, Technical University of Berlin, Germany

Copyright © 2005 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England
Telephone (+44) 1243 779777
Email (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to +44 1243 770620.
This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Other Wiley Editorial Offices
John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA
Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA
Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany
John Wiley & Sons Australia Ltd, 42 McDougall Street, Milton, Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809
John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging in Publication Data
Kim, Hyoung-Gook.
Introduction to MPEG-7 audio / Hyoung-Gook Kim, Nicolas Moreau, Thomas Sikora.
p. cm.
Includes bibliographical references and index.
ISBN-13 978-0-470-09334-4 (cloth: alk. paper)
ISBN-10 0-470-09334-X (cloth: alk. paper)
1. MPEG (Video coding standard) 2. Multimedia systems. 3. Sound—Recording and reproducing—Digital techniques—Standards.
I. Moreau, Nicolas. II. Sikora, Thomas. III. Title.
TK6680.5.K56 2005
006.6 96—dc22
2005011807

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN-13 978-0-470-09334-4 (HB)
ISBN-10 0-470-09334-X (HB)

Typeset in 10/12pt Times by Integra Software Services Pvt. Ltd, Pondicherry, India
Printed and bound in Great Britain by TJ International Ltd, Padstow, Cornwall
This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production.
Contents

List of Acronyms
List of Symbols

1 Introduction
  1.1 Audio Content Description
  1.2 MPEG-7 Audio Content Description – An Overview
    1.2.1 MPEG-7 Low-Level Descriptors
    1.2.2 MPEG-7 Description Schemes
    1.2.3 MPEG-7 Description Definition Language (DDL)
    1.2.4 BiM (Binary Format for MPEG-7)
  1.3 Organization of the Book

2 Low-Level Descriptors
  2.1 Introduction
  2.2 Basic Parameters and Notations
    2.2.1 Time Domain
    2.2.2 Frequency Domain
  2.3 Scalable Series
    2.3.1 Series of Scalars
    2.3.2 Series of Vectors
    2.3.3 Binary Series
  2.4 Basic Descriptors
    2.4.1 Audio Waveform
    2.4.2 Audio Power
  2.5 Basic Spectral Descriptors
    2.5.1 Audio Spectrum Envelope
    2.5.2 Audio Spectrum Centroid
    2.5.3 Audio Spectrum Spread
    2.5.4 Audio Spectrum Flatness
  2.6 Basic Signal Parameters
    2.6.1 Audio Harmonicity
    2.6.2 Audio Fundamental Frequency
  2.7 Timbral Descriptors
    2.7.1 Temporal Timbral: Requirements
    2.7.2 Log Attack Time
    2.7.3 Temporal Centroid
    2.7.4 Spectral Timbral: Requirements
    2.7.5 Harmonic Spectral Centroid
    2.7.6 Harmonic Spectral Deviation
    2.7.7 Harmonic Spectral Spread
    2.7.8 Harmonic Spectral Variation
    2.7.9 Spectral Centroid
  2.8 Spectral Basis Representations
  2.9 Silence Segment
  2.10 Beyond the Scope of MPEG-7
    2.10.1 Other Low-Level Descriptors
    2.10.2 Mel-Frequency Cepstrum Coefficients
  References

3 Sound Classification and Similarity
  3.1 Introduction
  3.2 Dimensionality Reduction
    3.2.1 Singular Value Decomposition (SVD)
    3.2.2 Principal Component Analysis (PCA)
    3.2.3 Independent Component Analysis (ICA)
    3.2.4 Non-Negative Factorization (NMF)
  3.3 Classification Methods
    3.3.1 Gaussian Mixture Model (GMM)
    3.3.2 Hidden Markov Model (HMM)
    3.3.3 Neural Network (NN)
    3.3.4 Support Vector Machine (SVM)
  3.4 MPEG-7 Sound Classification
    3.4.1 MPEG-7 Audio Spectrum Projection (ASP) Feature Extraction
    3.4.2 Training Hidden Markov Models (HMMs)
    3.4.3 Classification of Sounds
  3.5 Comparison of MPEG-7 Audio Spectrum Projection vs. MFCC Features
  3.6 Indexing and Similarity
    3.6.1 Audio Retrieval Using Histogram Sum of Squared Differences
  3.7 Simulation Results and Discussion
    3.7.1 Plots of MPEG-7 Audio Descriptors
    3.7.2 Parameter Selection
    3.7.3 Results for Distinguishing Between Speech, Music and Environmental Sound
    3.7.4 Results of Sound Classification Using Three Audio Taxonomy Methods
    3.7.5 Results for Speaker Recognition
    3.7.6 Results of Musical Instrument Classification
    3.7.7 Audio Retrieval Results
  3.8 Conclusions
  References

4 Spoken Content
  4.1 Introduction
  4.2 Automatic Speech Recognition
    4.2.1 Basic Principles
    4.2.2 Types of Speech Recognition Systems
    4.2.3 Recognition Results
  4.3 MPEG-7 SpokenContent Description
    4.3.1 General Structure
    4.3.2 SpokenContentHeader
    4.3.3 SpokenContentLattice
  4.4 Application: Spoken Document Retrieval
    4.4.1 Basic Principles of IR and SDR
    4.4.2 Vector Space Models
    4.4.3 Word-Based SDR
    4.4.4 Sub-Word-Based Vector Space Models
    4.4.5 Sub-Word String Matching
    4.4.6 Combining Word and Sub-Word Indexing
  4.5 Conclusions
    4.5.1 MPEG-7 Interoperability
    4.5.2 MPEG-7 Flexibility
    4.5.3 Perspectives
  References

5 Music Description Tools
  5.1 Timbre
    5.1.1 Introduction
    5.1.2 InstrumentTimbre
    5.1.3 HarmonicInstrumentTimbre
    5.1.4 PercussiveInstrumentTimbre
    5.1.5 Distance Measures
  5.2 Melody
    5.2.1 Melody
    5.2.2 Meter
    5.2.3 Scale
    5.2.4 Key