I propose four network architectures for improving security in machine learning model training, where the training data must remain secure from the model trainer and the model must remain secure from the data owner. The architectures are numbered 1 through 4, with Architecture 1 being the least secure and Architecture 4 the most. Architecture 4 is the final iteration implemented in practice and facilitates end-to-end security for the Provider and one or more Consumers. Each architecture comprises multiple processes running concurrently and performing distinct tasks. The processes are deployed in Docker containers with strictly defined networks that allow communication between processes only where necessary to uphold the security objectives. Attestation and cryptography are applied alongside other technologies, including FarmStack Trusted Connectors and Docker, all of which were employed to further reduce threats to data and application security.
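The idea of "strictly defined networks" can be illustrated with a minimal sketch. It assumes the usual Docker behaviour that containers can reach each other only when attached to a common user-defined network. The process and network names below are hypothetical and are not the topology of any of the four architectures; the sketch only shows how an allow-list of networks determines which processes may communicate.

```python
from itertools import combinations

# Hypothetical networks and their attached processes (illustrative names only,
# not taken from the architectures described in this paper).
NETWORKS = {
    "provider_net": {"provider_connector", "data_source"},
    "exchange_net": {"provider_connector", "consumer_connector"},
    "training_net": {"consumer_connector", "trainer"},
}


def can_communicate(a, b, networks=NETWORKS):
    """Return True if processes a and b share at least one network."""
    return any(a in members and b in members for members in networks.values())


def reachable_pairs(networks=NETWORKS):
    """List every pair of processes that is allowed to communicate directly."""
    processes = sorted(set().union(*networks.values()))
    return [
        (a, b)
        for a, b in combinations(processes, 2)
        if can_communicate(a, b, networks)
    ]


if __name__ == "__main__":
    for pair in reachable_pairs():
        print(pair)
```

In this toy layout the hypothetical `trainer` never shares a network with `data_source`, so any data reaching it must pass through the two connector processes.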
The important contributions are:
• Plausible architectures for training models in the agricultural industry, which provide security to both the Provider of the raw data and the Consumer of that data.
• A previously untested application of FarmStack Trusted Connectors.
• The development of the Internal Data Handler (IDH), an internal cryptographic file storage server that replaces disk storage.
My results show that, of the architectures described in this paper, the only ones worth deploying are Architectures 2 and 4. Architecture 2 falls far short of the security guarantees of Architecture 4 but is approximately 3.7 times faster than Architecture 4 in its current state. Architecture 4, while slower, addresses all objectives of the threat model, whereas Architecture 2 leaves many of them unaddressed. Hence, Architecture 4 is recommended over Architecture 2.