Page 33 - AIH-1-4
Artificial Intelligence in Health Optimized clustering in medical app detection
computationally less intensive and reduce the risk of overfitting, which is crucial given our limited sample size. However, we acknowledge that more complex models could potentially yield better performance and will consider this in future work.

5.3. Limitations

One significant limitation of our study is the small sample size of 20 medical apps and malware samples, along with 10 unknown benign apps and malware samples. This small sample size limits the generalizability of our findings. In future work, we plan to expand our dataset to include a more diverse and larger set of samples to validate our results further.

Another limitation is the lack of direct comparison with existing state-of-the-art methods for app detection. While we conducted a comprehensive literature review to understand the current landscape, direct empirical comparisons are necessary to validate the effectiveness of our approach rigorously. We aim to address this in future studies by benchmarking our method against established techniques using larger and more diverse datasets.

While our proposed method demonstrates promising results in terms of intracluster similarity and error reduction, further research with larger datasets and more complex models is needed to fully validate its effectiveness and generalizability.

6. Conclusion

The paper has successfully addressed the challenge of detecting zero-day health-care apps, a prevalent issue where conventional app detection techniques struggle with misclassifying zero-day traffic into predefined known classes. Our approach proposes a scheme that can identify zero-day apps while accurately classifying those belonging to predefined application classes. The proposed scheme encompasses three crucial modules: unknown discovery, app classification, and system update. By leveraging ANNs to determine centroids in K-means clustering, our study reveals that the hybrid model of K-means clustering using ANN enhances app detection, particularly for zero-day apps. We highlight the impact of unknown apps on the classification accuracy of supervised methods, validating the effectiveness of correlation-based feature selection for clustering essential features. With a focus on unknown discovery and (N + 1) class classification, the proposed model efficiently identifies zero-day traffic and undergoes frequent updates through training with zero-day apps within the respective cluster. This continuous model refinement contributes to an overall reduction in app detection errors. By optimizing K in K-means and the number of nodes in ANN, substantial improvements in results are attainable.

To further enhance the effectiveness of health-care app detection, future work should explore several avenues. Real-time data capture could improve classification accuracy in dynamic environments. Investigating advanced feature selection algorithms holds promise for achieving greater accuracy in app detection. In addition, the incorporation of weighted sampling techniques in training flows may provide more representative and effective models. The obtained results suggest that substantial improvements in the performance of health-care app detection are feasible. Future studies should focus on finding an optimal method for determining the number of clusters, a critical aspect of refining the proposed scheme. In addition, extending the study to encompass diverse health-care scenarios and data sources would enhance the robustness and applicability of the proposed detection model. Collecting more data is essential to strengthening the conclusions and reliability of the proposed methods.

Acknowledgments

None.

Funding

None.

Conflict of interest

The authors declare that they have no competing interests.

Author contributions

Conceptualization: Ciza Thomas
Formal Analysis: Ciza Thomas
Investigation: Ciza Thomas
Methodology: All authors
Writing – original draft: All authors
Writing – review & editing: All authors

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data

Not applicable.
Volume 1 Issue 4 (2024) 27 doi: 10.36922/aih.2585

