Food for Thought Experiment | Privacy and Computational Control Through Blockchain
2021-02-23

Blockchain has applications in many domains, including (but not limited to) finance, tourism, social media, digital technology, and healthcare. It also gives data scientists access to more reliable data and lets them draw on distributed computing. [1] Additionally, open-source communities like OpenMined provide tooling for running experiments in a blockchain environment.

Data Reliability

Blockchain can improve the end-to-end process of data handling: collecting data, tracing where each record came from, and supporting real-time analysis. It helps guarantee data quality because it can bypass intermediaries that might otherwise introduce errors.
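The traceability idea can be illustrated with a hash chain, the basic structure behind a blockchain ledger. What follows is a minimal sketch using only the Python standard library, not a real blockchain client: the function names (`record_hash`, `build_chain`, `is_valid`) and the sensor readings are hypothetical, and consensus, networking, and signatures are left out entirely.

```python
import hashlib
import json


def record_hash(payload: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the record before it."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Link records so that each one commits to everything before it."""
    chain, prev = [], "0" * 64  # placeholder "previous hash" for the first record
    for payload in payloads:
        digest = record_hash(payload, prev)
        chain.append({"payload": payload, "prev": prev, "hash": digest})
        prev = digest
    return chain


def is_valid(chain: list) -> bool:
    """Recompute every hash; an edited record no longer matches its stored hash."""
    prev = "0" * 64
    for block in chain:
        expected = record_hash(block["payload"], prev)
        if block["prev"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


# Hypothetical sensor readings standing in for collected data.
readings = [{"sensor": "A", "value": 21.5}, {"sensor": "A", "value": 21.7}]
chain = build_chain(readings)
print(is_valid(chain))  # True

chain[0]["payload"]["value"] = 99.9  # simulate tampering with an early record
print(is_valid(chain))  # False
```

Because every hash depends on the previous one, changing any record invalidates the chain from that point on, which is what makes the history of a dataset auditable.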

Distributed Computing

When analyzing huge amounts of data, practitioners typically rely on cloud computing services (e.g., GCP and Azure). Decentralized computing through blockchain could reduce computation costs and remove the need for third-party vendors. [2] The Ethereum-based Golem project is one such experiment: it aims to let users purchase computing resources from people with idle computers.

OpenMined – Data Science Project

OpenMined’s goal is to make the world’s data more private and secure. Specifically, this open-source community is lowering the barriers to entry for AI by letting individuals and companies host private datasets that data scientists can use for training or querying but cannot access directly. [3]

OpenMined extends the capabilities of PyTorch, TensorFlow, and Keras through its two libraries:

  • PySyft, which enables Python-based machine learning with remote execution, federated learning, and encryption.
  • PyGrid, which allows entities to host their data in the cloud.
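To make the federated learning idea concrete, here is a minimal sketch of federated averaging in plain Python. It does not use PySyft or PyGrid and does not reflect their APIs: the function names (`local_update`, `federated_average`), the two client datasets, and the learning rate are all assumptions chosen for illustration. Each client fits a one-variable linear model on data that never leaves it, and only the resulting weights are sent back to be averaged.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Run gradient descent on one client's private data; only weights leave."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n


def federated_average(updates):
    """Combine client weights, weighting each client by its sample count."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return (w, b)


# Hypothetical private datasets; every point follows y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]

global_weights = (0.0, 0.0)
for _ in range(100):  # communication rounds
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates)

print(global_weights)  # approximately (2.0, 1.0)
```

The coordinating loop only ever sees weights and sample counts, never the raw `(x, y)` pairs, which is the property that lets a model be trained on data its author cannot inspect.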

Traditionally, data science has been limited to centralized computation, usually on a single cluster. OpenMined makes it possible to create a machine learning model that is governed by multiple owners and trained on a dataset the modeler never sees. Using federated learning and on-device prediction, data scientists can train their models remotely without ever having direct access to the data. [3] Encrypted computation allows calculations to run securely in environments the data owner does not control, and differential privacy controls how much information about the underlying data is exposed through the final outputs, whether those are statistical results or model predictions.
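The access-control idea in the last sentence is commonly formalized as differential privacy. One standard way to achieve it for simple statistics is the Laplace mechanism, sketched below with the Python standard library only. This is an illustration under stated assumptions, not PySyft code: `laplace_noise` and `private_count` are hypothetical helpers, and the `ages` list is made-up data. The privacy parameter `epsilon` sets the trade-off, with smaller values adding more noise and revealing less.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


ages = [23, 35, 41, 29, 52, 61, 38, 47]  # hypothetical private records
rng = random.Random(0)  # seeded only so the demo is repeatable

for epsilon in (0.1, 1.0, 10.0):
    answer = private_count(ages, lambda a: a >= 40, epsilon, rng)
    print(f"epsilon={epsilon}: noisy count = {answer:.2f}")
```

The true count here is 4. With a large `epsilon` the released value stays close to it, while a small `epsilon` can move it far enough that no single person's presence in the dataset can be inferred from the answer.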

References

  1. Blockchain Applications from Connectbit
  2. Blockchain projects on Github
  3. PySyft and PyGrid Library Code Documentation
-Authored by Rajat Bansal, Data Scientist at Absolutdata