ABSTRACT
Information systems are employed by organizations for the collection,
filtering, processing, creation and distribution of data. In healthcare
delivery, patients are required to share
information with certain categories of health personnel to facilitate correct
diagnosis and to determine appropriate treatment. There have been cases of
unauthorized access to patient information by health personnel. Some of these
personnel eventually cause great harm to the patient by divulging sensitive
information. Existing Data Privacy Preservation (DPP) models are designed
for Clinical Decision Support Systems, and little information is available on
DPP in Health Information Systems (HIS) in Nigeria. This research therefore
focused on the development of a model for DPP in HIS to address this
inadequacy.
A model for DPP in HIS was developed using the iterative design
technique. The model developed comprises a local database that
contains the health information of patients, the Random Forest Decision Tree
(RFDT) algorithm, an attribute blocking module that employs the RFDT algorithm,
an attribute unblocking module which also uses the RFDT algorithm and a module
for the computation of time elapsed in unblocking attributes. Mandatory
Role-based Access Control was used to restrict the access health professionals
have to patient data; each category of health worker can only view the
attribute(s) needed for them to provide the service required to fulfill their
role. An application based on the RFDT algorithm was developed to instantiate
the model following the Waterfall Software Development Life Cycle. The NetBeans
Integrated Development Environment, MySQL server, Java Development Kit 8,
Scenebuilder 2.0, and the Navicat 8 query editor constituted the programming
environment. The application was evaluated against the machine learning
approach to DPP that employed the classification technique, by comparing its
efficiency with that of the Waikato Environment for Knowledge Analysis (WEKA)
version 3.8 software in ensuring DPP using the RFDT algorithm.
The model developed in this study provides a generic framework for DPP in HIS
that reveals the necessary components. This model provides a template that
could be adapted for use in studies on DPP in HIS. The application provides
health personnel with Graphical User Interfaces that depict each
professional’s access to the patient database while restricting access to
attributes not permitted for that category of health worker. The use of the
RFDT algorithm in WEKA for DPP gave an efficiency of 73.77%, while the approach
that employed the application gave an efficiency of 78.32%.
The model presented in this study would help protect sensitive patient data
from being accessed by health workers who are not authorized to do so. The
study showed that the application is more efficient than the WEKA software in
ensuring DPP using the RFDT algorithm. The DPP model proposed in this study
could also be employed in domains outside the health sector to curb the
challenges resulting from weak DPP.
TABLE OF CONTENTS
Title Page
Abstract
Table of Contents
List of Tables
List of Figures
CHAPTER ONE: INTRODUCTION
1.1 Background to the Study
1.2 Statement of the Problem
1.3 Objective of the Study
1.5 Justification for the Study
1.6 Scope of the Study
1.7 Operational Definition of Terms
CHAPTER TWO: REVIEW OF LITERATURE
2.0 Introduction
2.1 Schizophrenia
2.1.1 Symptoms of Schizophrenia
2.1.2 Factors that cause Schizophrenia
2.2 Clinical Decision Support Systems (CDSS)
2.2.1 Types of Clinical Decision Support Systems (CDSS)
2.2.1.1 Knowledge Based CDSS
2.2.1.2 Non Knowledge Based CDSS
2.2.2 Modern Trends of Implementing Clinical Decision Support Systems
2.2.2.1 Statistical Method
2.2.2.2 Hybrid Systems
2.3 Data Privacy
2.3.1 Data Privacy Preserving Methods
2.3.1.1 Privacy Preserving Data Mining
2.3.1.2 Privacy Preserving Data Mining Tasks and Algorithms
2.3.1.3 Identification, Authentication and Authorization
2.4 Role Based Access Control
2.5 Review of Closely Related Works
2.5.1 Cryptographic Approach to DPP
2.5.2 Machine Learning Approach to DPP
2.5.3 Data Privacy Preservation Models
CHAPTER THREE: METHODOLOGY
3.0 Introduction
3.1 Research Design
3.1.1 Design of Experiment
3.1.2 Variable Selection
3.2 Research Methods
3.3 Machine Learning Model Design – Algorithms
3.4 Proposed Model for the Preservation of Patient Data Privacy
3.4.1 Design of the Application
CHAPTER FOUR: DATA ANALYSIS, RESULTS AND DISCUSSION OF FINDINGS
4.0 Introduction
4.1 Blocking of Attributes across the Different Health Professional Categories Using WEKA
4.2 Blocking of Attributes across the Different Health Professional Categories Using the Application (Schizoapp)
4.3 Unblocking of Attributes across the Different Health Professional Categories Using WEKA
4.4 Unblocking of Attributes across the Different Health Professional Categories Using the Application (Schizoapp)
CHAPTER FIVE: SUMMARY, CONCLUSION AND RECOMMENDATION
5.0 Introduction
5.1 Summary
5.2 Conclusion
5.3 Recommendations
5.4 Suggestions for Further Studies
5.5 Contribution to Knowledge
5.6 Ethical Consideration
5.7 Post Research Benefits
References
APPENDIX
CHAPTER ONE
INTRODUCTION
1.1 Background to the Study
Health Information Systems (HIS) provide the bedrock for decision-making and
have four key functions: data generation, compilation, analysis and synthesis,
and communication and use. The HIS gathers data from the health sector and
other relevant sectors, analyzes the data and ensures their overall relevance,
quality, and timeliness, and converts data into information for health-related
decision-making. In addition to being essential for monitoring and evaluation,
the information system also provides early warning capability, supports
patient and health facility management, facilitates planning, supports and
stimulates research, permits health situation and trends analysis, supports
global reporting, and underpins communication of health challenges to diverse
users (WHO, 2009).
To improve the quality of medical care around the globe, efforts are being made
to increase the practice of evidence-based medicine through the use of a type
of HIS called Clinical Decision Support Systems (CDSS). Clinical Decision
Support provides clinicians, patients, or caregivers with clinical knowledge
and patient-specific information to help them reach decisions that enhance
patient care (Osheroff, Teich & Middleton, 2011). The patient’s information is
matched to a clinical knowledge base, and patient-specific appraisals are then
communicated effectively at appropriate times during patient care. Some CDSS
include forms and templates for entering and documenting patient information,
and alerts, reminders, and order sets for providing suggestions and other
support. The use of CDSS comes with many potential benefits. Importantly, CDSS
can increase adherence to evidence-based medical knowledge and can reduce
unnecessary variation in clinical practice. CDSS can also assist with
information management to support physicians’ decision-making abilities,
reduce their mental workload, and improve clinical workflows (Karsh et al.,
2010). When well designed and implemented, CDSS have the potential to improve
health care quality, increase efficiency and reduce health care costs (Berner,
2010).
Despite the promise of CDSS, several barriers can hinder their development and
implementation. To date, the medical knowledge base is incomplete, in part
because of insufficient clinical evidence (Englander & Carraccio, 2014).
Moreover, methodologies are still being designed to convert the knowledge base
into computable code, and interventions for conveying the knowledge to
clinicians in a way they can easily use in practice are in the nascent stages
of development. Low clinician demand for Clinical Decision Support is another
obstacle to broader CDSS adoption. Clinicians’ lack of motivation to use
CDSS appears to be related to usability issues with the Clinical Decision
Support intervention, its lack of integration into the clinical workflow,
concerns about autonomy, and the legal and ethical implications of adhering to
or overriding recommendations made by the CDSS (Berner, 2010). In addition, in
many cases, acceptance and use of CDSS hinge upon the adoption of electronic
medical records (EMRs), because EMRs can include Clinical Decision Support
applications as part of Computerized Provider Order Entry (CPOE) and
electronic prescribing systems.
One of the five recommendations made for CDSS in connection with the practice
of Evidence-based Medicine was to “develop maintainable technical and
methodological foundations for computer-based decision support” (Sim, Gorman
& Greenes, 2011). Also, the medical domain is “characterized by much
judgmental knowledge”. Consequently, a CDSS that can provide suggestive
knowledge representations based on data sets with patient attributes that
match the attributes of the patient in context is valuable to a medical
practitioner. Inevitably, there are situations in which few or no local
samples are available from which to draw conclusions. Several current
challenges have not been sufficiently addressed during the development of
CDSS. According to recent research, these challenges include: improvement of
the human-computer interface; dissemination of best practices in CDSS design,
development, and implementation; creation of an architecture for sharing
executable CDSS modules and services; combination of recommendations for
patients with co-morbidities; summary of patient-level information;
prioritization and filtering of recommendations to the user; prioritization of
CDSS content development and implementation; creation of Internet-accessible
clinical decision support repositories; usage of free text information to drive
clinical decision support; and mining of huge clinical databases to create new
CDSS (Kumar & Prabha, 2016).
Psychiatry is one branch of medicine that urgently needs HIS because there are
relatively few specialists in that area of medicine (Saha, Chant, Welham, &
McGrath, 2015). According to the National Alliance on Mental Illness, mental
illnesses are medical conditions that disrupt a person’s thinking, feeling,
mood, ability to relate to others, decision-making ability and daily
functioning (NAMI, 2011). Mental illnesses include schizophrenia, depression,
bipolar disorder, obsessive-compulsive disorder (OCD), posttraumatic stress
disorder (PTSD), borderline personality disorder, anxiety disorder and others.
However, schizophrenia involves a relatively higher display of psychotic
symptoms than most other mental illnesses (Amin, Agarwal & Beg, 2013).
Schizophrenia is a chronic and debilitating illness characterized by
perturbations in cognition, affect and behavior, all of which have a bizarre
aspect (Lehman et al., 2010). Because schizophrenia is a stigmatized illness,
it is important that schizophrenic patients’ data be kept with a high degree
of secrecy so that sensitive patient data are not divulged. It is therefore
expedient that, in Clinical Decision Support Systems that contain data of
schizophrenic patients, access to patient data by healthcare givers be
restricted based on their roles in the hospital. This can be achieved by
employing access control. Mandatory Role-Based Access Control is one type of
access control that can be employed for a study such as this. To boost the
security of a Health Information System (HIS) through data privacy
preservation, this study proposes a model for implementing data privacy
preservation in an HIS. The model would help boost the security of the HIS in
question by restricting users’ access to its database.
This study proposes a Data Privacy Preservation (DPP) model for HIS. To
guarantee the secrecy of sensitive patient data domiciled in an HIS, the study
involved the development of an application named Schizoapp, which was used to
instantiate the proposed DPP model. Schizoapp effected data privacy by
blocking attributes in a patient database based on the Mandatory Role-Based
Access Control (MAC) model, which is used to assign access rights to different
categories of health professionals based on their role in the hospital. The
study also compared the use of Schizoapp for data privacy preservation with
the machine learning approach to data privacy preservation, which employed the
Random Forest Decision Tree algorithm embedded in the WEKA software.
1.2 Statement of the Problem
In healthcare delivery, patients are required to share information with
certain categories of health personnel to facilitate correct diagnosis and to
determine appropriate treatment. However, patients would usually prefer their
sensitive information to be kept secret, particularly from persons who need
not have access to it, especially in cases of health problems such as
schizophrenia, as the disclosure of such private information may lead to social
stigma and discrimination. There have been cases where health personnel who,
by virtue of their role, ought not to have access to certain patient
information gained access to such information. Some of these health personnel
cause harm to the patient in question by divulging such details to other
individuals, thereby jeopardizing the patient’s health. The healthcare system
is the worse for it, as a number of patients may relapse into states from
which they had already improved, and this retrogression in patients’ health
status will in the long run take a toll on the healthcare system.
Existing Data Privacy Preservation (DPP) models are designed for Clinical
Decision Support Systems, and little information is available on DPP in
Health Information Systems (HIS) in Nigeria. This research therefore focused
on the development of a model for DPP in HIS to address this inadequacy.
1.3 Objective of the Study
The main objective of this study is to propose and implement a DPP model for
HIS. The specific objectives are to:
1. propose a model for DPP in HIS;
2. develop a DPP application to instantiate the model for DPP in HIS;
3. implement DPP in an HIS using the application developed in objective 2; and
4. evaluate the prototype application developed for its efficiency.
1.4 Methodology Overview
1. Major existing DPP models were reviewed and grouped into three clusters,
from which the most recent model in each cluster was selected. The three
models chosen from the clusters are: A Cloud-Based eHealth Model for Privacy
Preserving Data Integration by Dubovitskaya, Urovi, Vasirani, Aberer and
Schumacher (2015); A Data Privacy Preserving Model for a Clinical Decision
Support System by Deshmukh, Tijare and Sawalkar (2016); and A Privacy
Preserving Data Classification Model by Desale and Javheri (2016). In the
course of reviewing the models, the flaws in each of them were highlighted.
Taking these flaws into consideration, a model for data privacy preservation
in an HIS was proposed. The proposed model consists of:
i. a local database that contains the health information of patients;
ii. the Random Forest Decision Tree (RFDT) algorithm;
iii. an attribute blocking module that employs the RFDT algorithm;
iv. an attribute unblocking module which also uses the RFDT algorithm; and
v. a module for the computation of time elapsed in unblocking attributes.
2. An application for data privacy preservation in a Health Information
System, named Schizoapp, was built using the Waterfall Software Development
Life Cycle Model. The following tools were employed:
i. NetBeans Integrated Development Environment (IDE)
ii. MySQL server
iii. Java Development Kit (JDK) 8
iv. Scenebuilder 2.0
v. Navicat 8 query editor
vi. JavaFX
3. Using Mandatory Role-Based Access Control, access by the four categories of
healthcare professionals considered in this study (doctors, nurses,
psychologists and social workers) to the three sensitive attributes among the
eleven attributes in the dataset was restricted, such that each category is
only allowed to view the attribute(s) needed to provide the service required
by the patient. The sensitive attribute(s) that a category is not allowed to
see are blocked in the database, so that a health worker in that category can
see only the other attributes, thereby preserving patient data privacy.
Graphical User Interfaces were generated to depict each healthcare
professional’s view of patient data.
4. The machine learning approach to data privacy, which used the Random Forest
Decision Tree algorithm in the WEKA software for DPP, was compared with the
application-based approach, which employed the proposed DPP application. Both
approaches were evaluated for efficiency based on the amount of time taken to
unblock the attributes. Hence, the better approach for data privacy
preservation is the one for which the blocked attributes took longer to
unblock.
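The role-based restriction in step 3 and the comparison rule in step 4 can be
sketched as follows. This is a minimal illustration under stated assumptions,
not the Schizoapp implementation: the attribute names, the role-to-attribute
policy, the placeholder value and the function names are all hypothetical,
since the text does not name the three sensitive attributes or say which role
sees which. The sketch also does not use the RFDT algorithm that the study's
blocking module employs; it shows only the access rule itself.

```python
from typing import Dict, FrozenSet

# Hypothetical attribute names: the study blocks three sensitive attributes
# out of eleven, but the text does not name them.
SENSITIVE: FrozenSet[str] = frozenset(
    {"diagnosis", "family_history", "substance_use"}
)

# Hypothetical policy: the sensitive attributes each role is allowed to view.
ROLE_MAY_VIEW: Dict[str, FrozenSet[str]] = {
    "doctor": SENSITIVE,
    "nurse": frozenset({"diagnosis"}),
    "psychologist": frozenset({"diagnosis", "family_history"}),
    "social_worker": frozenset({"family_history"}),
}

BLOCKED = "[blocked]"  # placeholder shown instead of a blocked value


def view_for_role(record: dict, role: str) -> dict:
    """Return a copy of `record` with the sensitive attributes that
    `role` may not view replaced by the BLOCKED placeholder."""
    allowed = ROLE_MAY_VIEW[role]  # unknown roles raise KeyError
    return {
        name: (BLOCKED if name in SENSITIVE and name not in allowed else value)
        for name, value in record.items()
    }


def better_approach(unblock_seconds: Dict[str, float]) -> str:
    """Apply the step 4 rule: the approach whose blocked attributes
    took the longest to unblock is considered the better one."""
    return max(unblock_seconds, key=unblock_seconds.get)


# Made-up record for illustration only; not taken from the study's dataset.
example_record = {
    "age": 34,
    "gender": "F",
    "diagnosis": "D1",
    "family_history": "H1",
    "substance_use": "S1",
}

nurse_view = view_for_role(example_record, "nurse")
print(nurse_view)
```

Non-sensitive attributes pass through unchanged for every role, and a role
that is missing from the policy fails with an error rather than being shown
any data. Timings passed to `better_approach` would come from measurements;
none are assumed here.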
1.5 Justification for the Study
This study brings to the fore the need for psychiatric hospitals in Nigeria to
adopt Electronic Health Records for patient data rather than the paper records
that most of them currently use. When implemented by psychiatric hospitals in
Nigeria, the study will help mitigate intrusion into patient data privacy by
restricting access to patient data to only those persons eligible to view such
information by virtue of their role as healthcare professionals needed to keep
the patient in good health. By implementing the data privacy preservation
model proposed in this study, the stigmatization of schizophrenic patients in
Nigeria on account of their condition would be mitigated to a reasonable
degree.
1.6 Scope of the Study
The study focused on preserving the privacy of data belonging to some
schizophrenic patients and to some of the people who had at one time or
another been examined by psychiatrists to ascertain whether they were
schizophrenic. For the purpose of this study, two psychiatric hospitals were
visited to gather the required data: Federal Neuropsychiatric Hospital, Yaba,
Lagos, and Neuropsychiatric Hospital, Aro, Abeokuta. Two hundred and
sixty-three anonymous records of persons who had earlier visited Federal
Neuropsychiatric Hospital, Yaba, on account of showing symptoms suggestive of
schizophrenia, and two hundred and forty-eight anonymous records of persons
who had earlier visited Neuropsychiatric Hospital, Aro, Abeokuta, on account
of schizophrenic symptom(s), were obtained, giving a total of five hundred and
eleven records, which were used for the research. This was the number of
records that the two psychiatric hospitals were willing to release.
1.7 Operational Definition of Terms
================================================================
Health Information System: This refers to any system that captures, stores,
manages or transmits information related to the health of individuals or the
activities of organizations that work within the health sector.
Model: This is a representation of an
idea, an object or even a process or a system that is used to describe and
explain phenomena that cannot be experienced directly.
Data Privacy: This deals with the ability an organization has to determine
what data in a Health Information System can be shared with health personnel.
Mandatory Access Control: This refers to a type of access control by which the
operating system constrains the ability of a subject or initiator to access or
generally perform some sort of operation on an object or target.
Role Based Access Control: This is a policy-neutral access control mechanism
defined around roles and privileges, used to restrict system access to
authorized users.
Decision Tree is a flow-chart-like
structure, where each internal (non-leaf) node denotes a test on an attribute,
each branch represents the outcome of a test, and each leaf (or terminal) node
holds a class label.
Random Forests are an ensemble learning method for classification, regression
and other tasks that operates by constructing a multitude of decision trees at
training time and outputting the class that is the mode of the classes
(classification) or the mean prediction (regression) of the individual trees.
Machine Learning is a type of
artificial intelligence (AI) that provides computers with the ability to learn
without being explicitly programmed.
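The Decision Tree and Random Forests definitions above can be illustrated with
a short sketch. It assumes binary trees with numeric threshold tests, written
by hand as nested dictionaries; the attribute names, thresholds and leaf values
are hypothetical, and the training step (building many trees from the data) is
omitted. It shows only how a tree is traversed and how a forest combines the
outputs of its trees: the mode for classification and the mean for regression.

```python
from collections import Counter
from statistics import mean


def predict_tree(node, sample):
    """Walk one decision tree: each internal node tests an attribute,
    each branch is an outcome of the test, each leaf holds a label."""
    while isinstance(node, dict):
        passed = sample[node["attribute"]] <= node["threshold"]
        node = node["yes"] if passed else node["no"]
    return node  # a leaf: class label or numeric prediction


def forest_classify(trees, sample):
    """Classification: output the mode of the individual trees' classes."""
    votes = [predict_tree(tree, sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


def forest_regress(trees, sample):
    """Regression: output the mean of the individual trees' predictions."""
    return mean([predict_tree(tree, sample) for tree in trees])


# Hand-written toy trees (hypothetical attributes "x" and "y").
class_trees = [
    {"attribute": "x", "threshold": 5, "yes": "low", "no": "high"},
    {"attribute": "y", "threshold": 2, "yes": "low", "no": "high"},
    {"attribute": "x", "threshold": 8, "yes": "low", "no": "high"},
]
reg_trees = [
    {"attribute": "x", "threshold": 5, "yes": 1.0, "no": 3.0},
    {"attribute": "y", "threshold": 2, "yes": 2.0, "no": 4.0},
]

sample = {"x": 6, "y": 1}
print(forest_classify(class_trees, sample))  # two of three trees vote "low"
print(forest_regress(reg_trees, sample))     # mean of 3.0 and 2.0
```

For the sample shown, the first classification tree votes "high" and the other
two vote "low", so the forest outputs "low"; the two regression trees predict
3.0 and 2.0, so the forest outputs their mean.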
================================================================
Item Type: Postgraduate Material | Attribute: 169 pages | Chapters: 1-5