Task Overview
Understanding the architecture of a Task in DecentralML
The DecentralML framework is built around a concept called "Tasks". Tasks are central to its operation, allowing institutions, corporations, blue-chips and open-source projects to use the computing resources of the network, which may include mobile and other IoT devices, to perform valuable machine learning operations. In addition, because Tasks are recorded on chain, they provide a powerful audit trail, which is hugely valuable when versioning ML models. This could eventually support rolling models forward or backward, making it particularly useful for machine learning operations (MLOps) and providing essential auditing functions.
Each task in DecentralML is linked to a specific role within the machine learning workflow, and each role is associated with a particular Docker container designed for that task. These containers may include fully featured user interfaces, which would be useful in the case of Data Annotators. The ultimate goal is to automate the execution of these WASM-based Docker containers together with the off-chain worker pallet. Such automation would enable powerful and efficient processing, ideal for optimised decentralised server-farm environments, whether for private corporations, blue-chips and institutions, or as a powerful and unique alternative for open-source projects.
For example, the Task for the Model Contributor role involves downloading a Docker container and using it to execute data science tasks, in our example those related to training machine learning models. As a side benefit, by using Docker, Model Contributors can keep their data securely on their own device (node) and participate in a process called Federated Learning. This process allows them to synchronise the learning weights of their models with others in the network without sharing the actual data, ensuring privacy and security. They are then rewarded in a runtime-configurable Currency. In our first deliverable, Model Contributors perform these data science sub-tasks:
Download the new labelled data to be used for training.
Download the latest model to train.
Train the latest model using the new data.
Return the new model's weights as part of the task's results.
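The sub-tasks above can be sketched as a small Python script. This is a minimal illustration, not the actual Docker task code: the `download_*` helpers are hypothetical stand-ins for the container's real task I/O, and the model is a toy one-weight linear model trained by gradient descent.

```python
# Minimal sketch of the Model Contributor sub-tasks (hypothetical helpers,
# toy linear model) -- not the actual DecentralML Docker task code.

def download_labelled_data():
    # Hypothetical: in the real task this fetches newly labelled data.
    # Here, points on the line y = 2x stand in for training data.
    return [(float(x), 2.0 * x) for x in range(1, 6)]

def download_latest_model():
    # Hypothetical: fetches the current global model weights.
    return {"w": 0.0}

def train(model, data, lr=0.01, epochs=200):
    # Gradient descent on mean squared error for the model y = w * x.
    w = model["w"]
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return {"w": w}

def run_model_contributor_task():
    data = download_labelled_data()
    model = download_latest_model()
    new_model = train(model, data)
    # Only the updated weights are returned as the task result;
    # the training data never leaves the node.
    return new_model

updated = run_model_contributor_task()
```

In a real federated round, only `updated` (the weights) would be synchronised with the network, which is what keeps the underlying data private.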
As a second example, the Data Annotator role represents individuals who participate in the decentralised machine learning pipeline in the capacity of annotating new data items provided by the Model Creator. For example, they might receive a set of images in which to identify specific objects, recognise a specific song within a longer recording or podcast, or draw bounding boxes around specific objects in a video or image.
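To make the bounding-box example concrete, an annotation result might be structured as below. The field names (`task_id`, `item`, `boxes`, `label`) are illustrative assumptions, not the actual DecentralML schema.

```python
# Hypothetical shape of a bounding-box annotation result; the field names
# are assumptions for illustration, not the real DecentralML schema.
annotation_result = {
    "task_id": 42,                 # hypothetical task identifier
    "item": "image_0001.png",      # the data item being annotated
    "boxes": [
        {"label": "dog", "x": 34, "y": 18, "width": 120, "height": 96},
        {"label": "ball", "x": 210, "y": 150, "width": 40, "height": 40},
    ],
}

def validate_annotation(result):
    # Basic sanity checks an annotation UI might run before submission.
    assert result["boxes"], "at least one box is required"
    for box in result["boxes"]:
        assert box["width"] > 0 and box["height"] > 0, "boxes need area"
        assert box["label"], "every box needs a label"
    return True
```

A Data Annotator's Docker UI could run checks like `validate_annotation` locally before the result is submitted and the reward paid out.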
Like Model Contributors, Data Annotators are rewarded in a predefined Currency for their contributions. This system encourages participation by offering tangible rewards for tasks that are crucial in the machine learning process.
Starting a task in DecentralML is straightforward. It begins with a call to the create_task extrinsic within the decentralml_pallet. We implemented this using the substrateinterface Python API and provide a Python script in the Decentralml_docker environment.
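A call to create_task via the substrateinterface library would look roughly like the sketch below. The pallet and call names follow the text, but the call parameters (`task_type`, `reward`, `expiration_block`) and the pallet module name are illustrative assumptions, not the extrinsic's actual signature; the on-chain submission requires a running node, so it is shown commented out.

```python
# Hedged sketch of preparing a create_task call. The parameter names are
# assumptions for illustration; consult the pallet for the real signature.

def build_create_task_params(task_type, reward, expiration_block):
    # Assemble the parameters the create_task extrinsic would carry.
    return {
        "task_type": task_type,              # e.g. "ModelContributor"
        "reward": reward,                    # amount of the configured Currency
        "expiration_block": expiration_block,
    }

params = build_create_task_params("ModelContributor", 1_000, 500_000)

# Submitting against a live node would look roughly like this
# (requires a running chain, so it is left commented out here):
#
# from substrateinterface import SubstrateInterface, Keypair
# substrate = SubstrateInterface(url="ws://127.0.0.1:9944")
# keypair = Keypair.create_from_uri("//Alice")
# call = substrate.compose_call(
#     call_module="DecentralMLModule",   # assumed pallet name
#     call_function="create_task",
#     call_params=params,
# )
# extrinsic = substrate.create_signed_extrinsic(call=call, keypair=keypair)
# receipt = substrate.submit_extrinsic(extrinsic, wait_for_inclusion=True)
```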