iCIMS has three major categories of data extraction: Audit, Reporting, and Analytics.

iCIMS' taxonomy of Data Extraction for a CIS

This three-way taxonomy of data extraction identifies the different classes of activity performed by users of Clinical Information Systems (CIS). The three data extraction strategies are:

  1. Data Audit is the process of searching for specific classes of data in the CIS that show the activity of users in creating, changing and viewing data.

  2. Data Reporting is identifying a subset of data that is required for some structural or periodic need and that requires specific formatting and, optionally, descriptive statistics.

  3. Data Analytics is used to answer complex questions, which fall into five sub-types of query: ad hoc questions, hypothesis testing, scientific studies, semantic retrieval, and predictive modelling.

Data Audit

Data Audit is a function for viewing profiles of processing activity. The data should be available in three cases: all actions, patient-specific, and user-specific. Each case can operate across all CISs in a single installation or within a single CIS. iCIMS provides a variety of options and retrievable variables for the audit function, including the original request type, the passed parameters and the content passed (both identified by a hyperlink to the original source data), the patient identifier, the request date and time, the institution of the logged-in user, the user name, and the CIS accessed.
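As a rough illustration of the three audit views, the sketch below filters a list of audit entries by patient, by user, or not at all. The `AuditEntry` class, its field names and the sample values are hypothetical; they are modelled on the variables listed above and are not the iCIMS schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEntry:
    # Hypothetical field names modelled on the audit variables described above.
    request_type: str
    parameters: str
    patient_id: str
    requested_at: datetime
    institution: str
    user_name: str
    cis: str


def audit_view(entries, patient_id=None, user_name=None, cis=None):
    """Return entries matching the given filters; no filters means 'all actions'."""
    return [
        e for e in entries
        if (patient_id is None or e.patient_id == patient_id)
        and (user_name is None or e.user_name == user_name)
        and (cis is None or e.cis == cis)
    ]


# Invented sample log spanning two CISs in one installation.
log = [
    AuditEntry("view", "form=TumourDetails", "P001",
               datetime(2020, 3, 1, 9, 30), "Hospital A", "jsmith", "Oncology"),
    AuditEntry("change", "form=Referral", "P002",
               datetime(2020, 3, 1, 10, 5), "Hospital A", "mlee", "Oncology"),
    AuditEntry("create", "form=Referral", "P001",
               datetime(2020, 3, 2, 14, 0), "Hospital B", "mlee", "Cardiology"),
]

all_actions = audit_view(log)                                  # every entry
patient_specific = audit_view(log, patient_id="P001")          # one patient, all CISs
user_specific = audit_view(log, user_name="mlee", cis="Oncology")  # one user, one CIS
```

Leaving `cis` unset corresponds to auditing across the whole installation; setting it restricts the view to a single CIS.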

Data Reporting

Reporting can mean a number of different processes depending on what is required. We make these distinctions:

Patient Record Export reporting should be a simple table export from any CIS into CSV (for spreadsheets) or XML format. Table exporting means exporting all copies of a specified table, or group of tables, for single or multiple patients as required. This process should be an intrinsic function of a CIS, executed by selecting options in a reporting template and entering a date range if required. Basic variations of this function are:

  • Single Table (All Patients) Form Export: This function enables the export of a specific table (for example: Tumour Details) for all patients with an option of specifying a date range to restrict the results.
  • Patient Record Export: This exports the full patient record into a CSV or XML format. A Patient Record Export should be executable for a single patient or multiple patients.
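The export variations above can be sketched with the Python standard library. This is a minimal illustration only: the table name, column names and sample rows are invented, and a real CIS would read them from its own storage rather than from an in-memory list.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical rows of a "Tumour Details" table; real column names will differ.
rows = [
    {"patient_id": "P001", "recorded": date(2020, 1, 15), "site": "Lung"},
    {"patient_id": "P002", "recorded": date(2020, 6, 2), "site": "Breast"},
    {"patient_id": "P001", "recorded": date(2021, 2, 20), "site": "Lung"},
]


def select_rows(rows, start=None, end=None, patient_ids=None):
    """Keep rows inside the optional date range and optional set of patients."""
    return [
        r for r in rows
        if (start is None or r["recorded"] >= start)
        and (end is None or r["recorded"] <= end)
        and (patient_ids is None or r["patient_id"] in patient_ids)
    ]


def to_csv(rows):
    """Serialise rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["patient_id", "recorded", "site"], lineterminator="\n"
    )
    writer.writeheader()
    for r in rows:
        writer.writerow({**r, "recorded": r["recorded"].isoformat()})
    return buffer.getvalue()


def to_xml(rows, table_name="TumourDetails"):
    """Serialise rows as XML text with one <record> element per row."""
    root = ET.Element(table_name)
    for r in rows:
        record = ET.SubElement(root, "record")
        for key, value in r.items():
            ET.SubElement(record, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Single table, all patients, restricted to a date range.
csv_text = to_csv(select_rows(rows, start=date(2020, 1, 1), end=date(2020, 12, 31)))

# Same table for a single patient, exported as XML.
xml_text = to_xml(select_rows(rows, patient_ids={"P001"}))
```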

Selective Reporting is retrieving either all or part of the records of patients defined by some criteria. This is often used for activities such as case reviews, sharing data with research groups, or providing data under statutory obligations to the Cancer Registry.

Tracking List Reporting is compiling a tracking list of patients according to some attributes in their patient record. The tracked content may be clinical or administrative. The notion of a tracking list is that patients on the list need to be monitored for some aspect of their health, either because they are in the process of care or because they are waiting for some action to be taken by them or on their behalf. In short, they are on the list because they need to be kept in mind for some kind of attention.

Example of a Descriptive Statistics report

iCIMS allows you to design your own Descriptive Statistics report without knowledge of the storage tables

Descriptive Statistics is the compilation of a group of statistics over a set of patients. This is usually performed hand-in-hand with Selective Reporting. The statistics give a general characterisation of the patients, staff or clinical work in the set of selected records.

Compiling any of the above categories of reports requires designing a table that draws on fields from within the CIS. Descriptive Statistics are defined by statistical functions that aggregate appropriate fields in the tracking list.
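A minimal sketch of this idea follows, assuming a tracking list held as a list of dictionaries. The field names, the "awaiting review" criterion and the chosen statistics are all hypothetical and serve only to show statistical functions aggregating fields of a tracking list.

```python
from statistics import mean, median

# Hypothetical patient records; field names are illustrative only.
patients = [
    {"patient_id": "P001", "age": 64, "status": "awaiting review", "days_waiting": 12},
    {"patient_id": "P002", "age": 58, "status": "discharged", "days_waiting": 0},
    {"patient_id": "P003", "age": 71, "status": "awaiting review", "days_waiting": 30},
    {"patient_id": "P004", "age": 49, "status": "awaiting review", "days_waiting": 9},
]

# Tracking list: patients selected by an attribute in their record.
tracking_list = [p for p in patients if p["status"] == "awaiting review"]

# Each statistic pairs a field of the tracking list with an aggregating function.
statistics_spec = {
    "patients": ("patient_id", len),
    "mean_age": ("age", mean),
    "median_days_waiting": ("days_waiting", median),
    "max_days_waiting": ("days_waiting", max),
}

summary = {
    label: func([p[field] for p in tracking_list])
    for label, (field, func) in statistics_spec.items()
}
```

Keeping the statistics in a specification separate from the selection step means the same tracking list can be summarised in different ways without being rebuilt.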

Periodic Reporting is what is commonly referred to as “Standard Reports”. These are reports that have to be generated on a regular basis defined by a time period. They are generally required to abide by specific standards on inclusion criteria, disclosed data fields, and/or descriptive statistics. A periodic report would generally consist of a tracking table that includes records meeting specific criteria for inclusion (more commonly known as “Conditions”). The fields to be included from each record in the report can be selected from any part of the patient record, regardless of the conditions used for inclusion. If the report requires descriptive statistics, an additional design step adds the statistical fields, which are configured to run the desired calculations on the table.

Once a periodic report is designed as a form, it should become a standard report that can be generated at the click of a button in the CIS (possibly with an additional step of applying a date range). A report that is generated in the CIS should also be exportable to CSV or XML format as required. Designing periodic reports under this model should also allow the report conditions to be altered as required. Clinical work is continuously evolving and adapting to change and process improvement, and governing bodies revise their reporting standards accordingly. Therefore, if a reporting standard is altered, the reporting form should always be modifiable to abide by the new standard, whether the change is to field requirements or to conditions.
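The design described above (conditions, selected fields, optional statistics, and an optional date range at run time) might be modelled roughly as follows. The `ReportDefinition` class and all field names are hypothetical and are not taken from iCIMS.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ReportDefinition:
    """Hypothetical periodic report: conditions, output fields, optional statistics."""
    name: str
    conditions: list                                  # predicates: record -> bool
    fields: list                                      # field names copied into the table
    statistics: dict = field(default_factory=dict)    # label -> (field, function)

    def run(self, records, start=None, end=None):
        selected = [
            r for r in records
            if all(condition(r) for condition in self.conditions)
            and (start is None or r["date"] >= start)
            and (end is None or r["date"] <= end)
        ]
        table = [{f: r[f] for f in self.fields} for r in selected]
        stats = {
            label: func([r[f] for r in selected])
            for label, (f, func) in self.statistics.items()
        }
        return table, stats


# Invented sample records.
records = [
    {"patient_id": "P001", "date": date(2021, 1, 10), "diagnosis": "Lung", "age": 64},
    {"patient_id": "P002", "date": date(2021, 2, 5), "diagnosis": "Breast", "age": 58},
    {"patient_id": "P003", "date": date(2021, 2, 20), "diagnosis": "Lung", "age": 71},
    {"patient_id": "P004", "date": date(2021, 4, 1), "diagnosis": "Lung", "age": 49},
]

report = ReportDefinition(
    name="Quarterly lung cases",
    conditions=[lambda r: r["diagnosis"] == "Lung"],
    fields=["patient_id", "date"],
    statistics={"cases": ("patient_id", len), "oldest": ("age", max)},
)

# Generate the report for the first quarter of 2021.
q1_table, q1_stats = report.run(records, start=date(2021, 1, 1), end=date(2021, 3, 31))

# If the reporting standard changes, only the definition is edited.
report.conditions.append(lambda r: r["age"] >= 65)
revised_table, revised_stats = report.run(
    records, start=date(2021, 1, 1), end=date(2021, 3, 31)
)
```

Because the conditions are data rather than hard-coded logic, a change in a reporting standard becomes a change to the definition and leaves the generation step untouched.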

Data Analytics

More advanced forms of reporting are provided by CliniDAL. This package provides for five more types of analysis and reporting.

iCIMS’ model of information extraction is built on the idea of four generic users: point-of-care clinician, researcher clinician, administrator clinician, and clinical auditor. Whilst one can think of these as different people, we see them as different roles and have observed senior staff performing all four roles in a single day. Identifying these roles is important because it defines their processing requirements. The point-of-care clinician needs record retrieval; the other three roles need record aggregation, and the auditor needs record retrieval as well. This analysis dictates that record aggregation is the most important analytic function of a CIS. Hence, the user needs to understand a great deal about the CIS to allow ready questioning across a range of topics. We divide the data analytics needs of the clinician into the following categories:

  • Ad hoc query
  • Hypothesis testing
  • Scientific enquiry
  • Semantic concept identification
  • Predictive modelling

An ad hoc query is a question asked once without the expectation of asking it again. The variables used in an ad hoc query need to be readily recognisable through the CIS interface, and the clinician needs to be able to frame the query easily and recognise it as semantically well formed. If an ad hoc question is asked repeatedly, the user might develop an equivalent selective report once it is clear the question is needed often enough for routine work.

Hypothesis Testing is the standard process of asking for a comparison between two groups to be evaluated statistically. It is assumed the two groups can be separated by a small number of non-confounding variables.
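As one possible instance of a two-group comparison, the sketch below computes Welch's t statistic and its approximate degrees of freedom using only the standard library. The choice of Welch's test is an assumption made for illustration, since the text does not name a specific test, and the two groups are invented values.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    va = variance(a) / len(a)   # squared standard error of group a
    vb = variance(b) / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented outcome values for two groups separated by a single variable.
group_a = [5, 7, 6, 8, 9]
group_b = [3, 4, 5, 4, 4]

t_statistic, degrees_of_freedom = welch_t(group_a, group_b)
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide; a statistics package would normally be used for that step.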

Scientific Studies is the task of framing a set of questions to identify multiple cohorts in a systematic study and statistically test comparative variables for their variation between groups.

Semantic Concept Identification uses categorical variables defined in SNOMED CT to identify semantic concepts in text. It extends the other types of analytics by searching the free-text components of the clinical record to retrieve concepts that are then used as data in the evaluation. Recognition of the free-text concepts is a statistical NLP process that uses SNOMED CT descriptions as a representation of the semantics of the variables to be extracted. The accuracy of concept recognition is a function of the level of correspondence between the writing in the notes and the phrasal formations in SNOMED CT concept descriptions.
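The dependence of accuracy on wording can be illustrated with a deliberately simple stand-in: a bag-of-words overlap score between a note and each concept description. This is not the statistical NLP process described above, and the concept identifiers and descriptions below are placeholders, not real SNOMED CT content.

```python
import re

# Placeholder concept identifiers and descriptions; NOT real SNOMED CT content.
concept_descriptions = {
    "CONCEPT-1": ["myocardial infarction", "heart attack"],
    "CONCEPT-2": ["diabetes mellitus"],
}


def tokens(text):
    """Lower-case alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())


def find_concepts(note, descriptions, threshold=1.0):
    """Return concepts whose best-matching description overlaps the note enough.

    The score is the fraction of a description's words that occur in the note.
    """
    note_tokens = set(tokens(note))
    found = {}
    for concept_id, phrases in descriptions.items():
        best = max(
            len(set(tokens(p)) & note_tokens) / len(set(tokens(p)))
            for p in phrases
        )
        if best >= threshold:
            found[concept_id] = best
    return found


note = "Patient admitted following a heart attack. History of type 2 diabetes."

strict = find_concepts(note, concept_descriptions)                  # full overlap only
lenient = find_concepts(note, concept_descriptions, threshold=0.5)  # partial overlap
```

With the strict threshold only the concept whose description appears word for word in the note is found; the second concept is recovered only when partial overlap is accepted, because the note's wording differs from the description.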

Predictive Modelling is the construction of evidence-based models used for determining patient prognoses.