
    Automatic speaker verification and diarization on VoxCeleb data collection

    YANG-THESIS-2020.pdf (19.70Mb)
    Date
    2020-04-21
    Author
    Yang, Yufeng
    Abstract
    Automatic speaker verification (ASV) has attracted growing attention in speech research in recent years. Given the importance of cybersecurity and the protection of personal property, ASV could complement fingerprint and face recognition in many future applications. ASV research requires a variety of datasets to train good models; existing datasets include NIST SRE and VoxCeleb. In this work, the VoxCeleb data collection pipeline is adopted to collect a non-English dataset of East Asian language-speaking celebrities (EACeleb). To remove noisy segments from the pipeline output and make the dataset cleaner, speaker diarization is applied and the collected data is filtered. Because the collected data lacks ground-truth labels, ASV is used to measure the improvement in data cleanness. Measured with a pretrained x-vector model, the equal error rate (EER) after speaker diarization can be lowered by 25.63% compared to the original EACeleb. In addition, when a speaker verification model is trained on EACeleb data and its EER is tested, EACeleb after diarization can outperform VoxCeleb by 36.78%.
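The EER metric used in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names `equal_error_rate` and `relative_reduction` are hypothetical, the threshold sweep is a simple approximation, and reading the reported percentages as relative reductions is an assumption, since the abstract does not say whether they are relative or absolute.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    target_scores: similarity scores for same-speaker trials.
    nontarget_scores: similarity scores for different-speaker trials.
    A trial is accepted when its score is >= the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap, eer = float("inf"), None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance rate: different-speaker trials wrongly accepted.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # False rejection rate: same-speaker trials wrongly rejected.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, improved_eer):
    """Relative EER reduction in percent (assumed interpretation)."""
    return 100.0 * (baseline_eer - improved_eer) / baseline_eer
```

Under these assumptions, `equal_error_rate` would be called once on trial scores from the original data and once on scores from the diarized data, and `relative_reduction` would compare the two results. Production toolkits usually interpolate the ROC curve rather than sweep raw scores, so values from this sketch may differ slightly from theirs.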
    URI
    http://hdl.handle.net/1853/62830
    Collections
    • Georgia Tech Theses and Dissertations [23403]
    • School of Electrical and Computer Engineering Theses and Dissertations [3303]

    © 2020 Georgia Institute of Technology