Note: This is the second post of the series Data Tagging in Medical Imaging. You can find the first post here.

In the previous post of this series, Data Tagging in Medical Imaging, we gave an overview of the processes you must put into practice to scale your data tagging engine. In this post, we discuss what these processes are, how they affect data tagging, and what to consider before finalizing them.

First of all, you need some resources to set up these processes. To set them up, you need to ensure the availability of the following:

So, let’s dive into these processes: what they are, and what role they play in efficient data tagging. We have listed three main processes. Ideally, they all form one large workflow designed to help you annotate your medical dataset in a faster, more scalable and more accurate way.

Let’s discuss these processes in detail below:

Data Flow

It is really critical to define how the data will flow through all the stakeholders at different stages.
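One way to make such a flow explicit is to write down the stages a piece of data passes through and the hand-offs allowed between them. The sketch below is only an illustration: the stage names and the `advance` helper are hypothetical and not part of any particular tagging engine, so adapt them to your own stakeholders.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stages; rename them to match your own stakeholders.
    RECEIVED = "received"
    ASSIGNED = "assigned"
    ANNOTATED = "annotated"
    REVIEWED = "reviewed"
    DELIVERED = "delivered"


# Allowed hand-offs between stages (an assumption for illustration).
ALLOWED = {
    Stage.RECEIVED: {Stage.ASSIGNED},
    Stage.ASSIGNED: {Stage.ANNOTATED},
    Stage.ANNOTATED: {Stage.REVIEWED},
    # A failed review sends the data back to be assigned again.
    Stage.REVIEWED: {Stage.DELIVERED, Stage.ASSIGNED},
    Stage.DELIVERED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the hand-off is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Encoding the hand-offs as data rather than scattered conditions keeps the flow easy to review with non-technical stakeholders, and any attempt to skip a stage fails loudly instead of silently.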

Quality Assurance/Quality Control

This is perhaps the most important process in your workflow. The entire purpose of this exercise is to create a dataset that can be used to train machine learning models, and the models you create are only as accurate as their training dataset. The following are things you can do to create a strong QA/QC layer in your workflow.

Providing Annotated Data to the Data Scientists

This is the final layer, and there is a lot of scope for miscommunication in it. The entire purpose of annotating all the medical data is to provide it to the Data Science team so that they can build the model. The Data Science team is always running a number of experiments, each of which needs a different amount of data. Across all these experiments, it is very easy for the Data Scientists to lose track of the stream of annotated data coming from the Tagging Engine. This makes maintaining data flow logs really important.
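A data flow log can be as simple as a record of which annotated batch was delivered for which experiment, and when. The following is a minimal sketch under that assumption; the `Delivery` and `DeliveryLog` names and their fields are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Set


@dataclass
class Delivery:
    # One hand-off of annotated data to the Data Science team (hypothetical fields).
    batch_id: str
    experiment: str
    delivered_on: date
    image_ids: List[str]


@dataclass
class DeliveryLog:
    entries: List[Delivery] = field(default_factory=list)

    def record(self, delivery: Delivery) -> None:
        """Append a delivery to the log."""
        self.entries.append(delivery)

    def images_for(self, experiment: str) -> Set[str]:
        """Return every image ID delivered so far for one experiment."""
        return {
            image_id
            for delivery in self.entries
            if delivery.experiment == experiment
            for image_id in delivery.image_ids
        }


log = DeliveryLog()
log.record(Delivery("batch-001", "exp-a", date(2024, 1, 5), ["img-1", "img-2"]))
log.record(Delivery("batch-002", "exp-b", date(2024, 1, 9), ["img-3"]))
log.record(Delivery("batch-003", "exp-a", date(2024, 1, 12), ["img-2", "img-4"]))
```

With a log like this, a Data Scientist can ask which images an experiment has received so far (here, `log.images_for("exp-a")`) instead of reconstructing it from emails or file timestamps.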

In this blog, we discussed the initial requirements for setting up a scalable tagging engine and the core processes involved. Processes such as data flow, QA/QC and the training of medical professionals are critical to deploying a smart tagging engine, and so is formulating them properly.

In the next blog of this series, we will discuss the most important aspect of the data tagging capabilities, Quality Control. You can find the first blog of the series here. Please watch this space for more.

ParallelDots AI APIs is a Deep Learning powered web service by ParallelDots Inc that can comprehend a huge amount of unstructured text and visual content to empower your products. You can check out some of our text analysis APIs and reach out to us by filling in this form here, or write to us at [email protected].