
    Data Privacy in the Modern Machine Learning Ecosystem

    View/Open
    TRUEX-DISSERTATION-2021.pdf (8.177Mb)
    Date
    2021-04-30
    Author
    Truex, Stacey
    Abstract
    The explosion of data collection and advances in artificial intelligence and machine learning have motivated a robust economy around cloud-based machine learning services. While such services provide opportunities for a broad array of individuals and companies to leverage the power of modern machine learning, they also introduce new vulnerabilities and privacy risks, such as membership inference attacks, attribute inference attacks, and data poisoning attacks. Such attacks can allow for the malicious manipulation of model outcomes and can cause serious violations of the data privacy of those individuals who have contributed to the model learning process. Federated learning (FL) is a decentralized collaborative machine learning paradigm developed in response to such privacy risks. In FL, machine learning models are trained via multiple rounds of communication using a distributed computing platform. FL participants share only their local training model updates, so each participant's private training data remains local. While FL systems protect raw data from explicit disclosure, such systems remain vulnerable to inference-based privacy risks, such as membership or attribute inference attacks, as well as private training data poisoning attacks. This dissertation research is dedicated to making original contributions towards addressing the growing public concern and legislative action surrounding data privacy in the modern machine learning ecosystem. This dissertation research first takes a holistic approach to create a structured and comprehensive analysis of privacy risks in machine learning, including a characterization of privacy vulnerabilities in both centralized and decentralized settings, an in-depth study of inference-based privacy attacks (specifically membership inference) against machine learning models, and a framework for evaluating membership inference risks in machine learning.
The second contribution is the development of privacy-preserving machine learning solutions. This includes an analysis of privacy-preserving techniques in machine learning as well as protocols for the private training and evaluation of machine learning models under formal privacy frameworks, including differential privacy and secure multiparty computation (SMC). The next contribution consists of an analysis of the challenges and system considerations for extending privacy protection to the growing domain of federated learning. The final contribution is a proposed architecture, TSC-PFed, for trust- and security-enhanced customizable private federated learning. To this end, we propose the development of a privacy-enhanced federated learning system which incorporates both differential privacy and SMC to privately train accurate predictive models. Within our TSC-PFed system we include support for trust dynamics within a federated learning system, allowing FL participants to decrease the degree of noise injected locally by a customizable trust factor $t$ while still adhering to a global differential privacy guarantee. We additionally provide support for security enhancements as well as customizable settings which allow participants to tune the type and level of privacy provided by TSC-PFed.
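
To make the federated learning loop described in the abstract concrete, the following is a minimal sketch, not the TSC-PFed protocol itself. It trains a one-parameter model over several rounds in which each participant shares only a noisy local update. Everything named here is an assumption introduced for illustration: the function names, the toy squared-error model, and in particular the rule that a trust factor `t` divides the local Gaussian noise variance by `t`. The plain average stands in for a secure aggregation step, which the sketch does not implement.

```python
import math
import random


def local_noise_std(sigma, trust_factor):
    """Hypothetical rule: divide the local noise variance by the trust factor."""
    if trust_factor < 1:
        raise ValueError("trust_factor must be at least 1")
    return sigma / math.sqrt(trust_factor)


def local_update(data, global_weight, lr=0.1):
    """One gradient step on mean squared error for a 1-D mean-estimation model."""
    grad = sum(global_weight - x for x in data) / len(data)
    return global_weight - lr * grad


def federated_round(datasets, global_weight, sigma, trust_factor, rng):
    """Each participant trains locally and shares only a noisy update."""
    std = local_noise_std(sigma, trust_factor)
    updates = [
        local_update(data, global_weight) + rng.gauss(0.0, std)
        for data in datasets
    ]
    # Plain averaging stands in for secure aggregation (assumption):
    # raw data never leaves the participants, only the updates do.
    return sum(updates) / len(updates)


def train(datasets, rounds, sigma, trust_factor, seed=0):
    """Run several communication rounds and return the final global weight."""
    rng = random.Random(seed)
    weight = 0.0
    for _ in range(rounds):
        weight = federated_round(datasets, weight, sigma, trust_factor, rng)
    return weight


if __name__ == "__main__":
    participants = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(train(participants, rounds=200, sigma=0.5, trust_factor=3))
```

Under this assumed rule, raising `trust_factor` lowers the noise each participant injects. Because the variances of independent Gaussian noise terms add, the summed noise across `n` participants has variance `n * sigma**2 / trust_factor`. Whether that total meets a given differential privacy guarantee depends on the mechanism and threat model, which this sketch does not analyze.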
    URI
    http://hdl.handle.net/1853/66438
    Collections
    • College of Computing Theses and Dissertations [1156]
    • Georgia Tech Theses and Dissertations [23403]
